IETF-syslog messages
This section describes the format of a syslog message, according to the IETF-syslog protocol. A syslog message consists of the following parts: PRI, HEADER, STRUCTURED-DATA, and MSG.
The following is a sample syslog message (source: https://tools.ietf.org/html/rfc5424):
<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - BOM'su root' failed for lonvick on /dev/pts/8
The message corresponds to the following format:
<priority>VERSION ISOTIMESTAMP HOSTNAME APPLICATION PID MESSAGEID STRUCTURED-DATA MSG
- Facility is 4, severity is 2, so PRI is 34 (4 * 8 + 2).
- The VERSION is 1.
- The message was created on 11 October 2003 at 10:14:15pm UTC, 3 milliseconds into the next second.
- The message originated from a host that identifies itself as "mymachine.example.com".
- The APP-NAME is "su" and the PROCID is unknown.
- The MSGID is "ID47".
- The MSG is "'su root' failed for lonvick...", encoded in UTF-8.
- In this example, the encoding is defined by the BOM. The byte order mark (BOM) is a Unicode character used to signal the byte-order of the message text.
- There is no STRUCTURED-DATA present in the message; this is indicated by "-" in the STRUCTURED-DATA field.
The HEADER part of the message must be in plain ASCII format and the parameter values of the STRUCTURED-DATA part must be in UTF-8, while the MSG part should be in UTF-8. The different parts of the message are explained in the following sections.
The PRI message part
The PRI part of the syslog message (known as Priority value) represents the Facility and Severity of the message. Facility represents the part of the system sending the message, while severity marks its importance. The Priority value is calculated by first multiplying the Facility number by 8 and then adding the numerical value of the Severity. The possible facility and severity values are presented below.
NOTE: Facility codes may slightly vary between different platforms. The syslog-ng application accepts facility codes as numerical values as well.
Table 3: syslog Message Facilities
Numerical Code | Facility
0 | kernel messages
1 | user-level messages
2 | mail system
3 | system daemons
4 | security/authorization messages
5 | messages generated internally by syslogd
6 | line printer subsystem
7 | network news subsystem
8 | UUCP subsystem
9 | clock daemon
10 | security/authorization messages
11 | FTP daemon
12 | NTP subsystem
13 | log audit
14 | log alert
15 | clock daemon
16-23 | locally used facilities (local0-local7)
The following table lists the severity values.
Table 4: syslog Message Severities
Numerical Code | Severity
0 | Emergency: system is unusable
1 | Alert: action must be taken immediately
2 | Critical: critical conditions
3 | Error: error conditions
4 | Warning: warning conditions
5 | Notice: normal but significant condition
6 | Informational: informational messages
7 | Debug: debug-level messages
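The Priority calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than anything taken from syslog-ng itself: the function names encode_pri and decode_pri are hypothetical, and the range checks assume the facility codes 0-23 and severity codes 0-7 listed in the tables.

```python
from typing import Tuple

# Severity names in numerical order, as listed in Table 4.
SEVERITIES = ["Emergency", "Alert", "Critical", "Error",
              "Warning", "Notice", "Informational", "Debug"]


def encode_pri(facility: int, severity: int) -> int:
    """Combine facility and severity: PRI = facility * 8 + severity."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity


def decode_pri(pri: int) -> Tuple[int, int]:
    """Split a Priority value back into (facility, severity)."""
    return divmod(pri, 8)


facility, severity = decode_pri(34)
print(facility, SEVERITIES[severity])  # 4 Critical
```

Because the severity occupies the lowest three bits, integer division and remainder by 8 are enough to recover both values from the PRI of the sample message.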
The HEADER message part
The HEADER part contains the following elements:
- VERSION: Version number of the syslog protocol standard. Currently this can only be 1.
- ISOTIMESTAMP: The time when the message was generated in the ISO 8601 compatible standard timestamp format (yyyy-mm-ddThh:mm:ss+-ZONE), for example: 2006-06-13T15:58:00.123+01:00.
- HOSTNAME: The machine that originally sent the message.
- APPLICATION: The device or application that generated the message.
- PID: The process name or process ID of the syslog application that sent the message. It is not necessarily the process ID of the application that generated the message.
- MESSAGEID: The ID number of the message.
NOTE: The syslog-ng application supports other timestamp formats as well, like ISO, or the PIX extended format. The timestamp used in the IETF-syslog protocol is derived from RFC3339, which is based on ISO8601. For details, see the ts-format() option in Global options.
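To illustrate the timestamp format, the following Python sketch parses the two timestamps that appear in this section using only the standard library. The helper name parse_iso_timestamp is hypothetical, and replacing a trailing "Z" with "+00:00" is a workaround assumed here because older Python versions do not accept "Z" in datetime.fromisoformat().

```python
from datetime import datetime, timedelta


def parse_iso_timestamp(text: str) -> datetime:
    """Parse an ISO 8601 / RFC3339 style timestamp into an aware datetime."""
    # Map the "Z" (UTC) suffix to an explicit offset for older Python versions.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


ts = parse_iso_timestamp("2006-06-13T15:58:00.123+01:00")
print(ts.utcoffset() == timedelta(hours=1))  # True
print(ts.microsecond)                        # 123000
```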
The syslog-ng PE application will truncate the following fields:
- If APP-NAME is longer than 48 characters it will be truncated to 48 characters.
- If PROCID is longer than 128 characters it will be truncated to 128 characters.
- If MSGID is longer than 32 characters it will be truncated to 32 characters.
- If HOSTNAME is longer than 255 characters it will be truncated to 255 characters.
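As a rough sketch of how the HEADER fields and the truncation limits above fit together, the following Python code splits the header of an IETF-syslog message with a regular expression and then shortens the four fields. It is not the parser that syslog-ng PE uses: the regular expression, the LIMITS table, and the parse_header name are assumptions made for illustration, and the sample omits the BOM placeholder shown in the sample message.

```python
import re

# PRI, VERSION, then six space-separated fields; everything else is kept as "rest".
HEADER_RE = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app_name>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) (?P<rest>.*)",
    re.DOTALL,
)

# Maximum field lengths (in characters) from the list above.
LIMITS = {"hostname": 255, "app_name": 48, "procid": 128, "msgid": 32}


def parse_header(line: str) -> dict:
    """Return the header fields of an IETF-syslog message as a dictionary."""
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError("not an IETF-syslog header")
    fields = match.groupdict()
    for name, limit in LIMITS.items():
        fields[name] = fields[name][:limit]  # truncate over-long fields
    return fields


sample = ("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - ID47 - "
          "'su root' failed for lonvick on /dev/pts/8")
header = parse_header(sample)
print(header["hostname"], header["app_name"], header["msgid"])
```

The "rest" field still contains the STRUCTURED-DATA and MSG parts, which are described in the following sections.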
The STRUCTURED-DATA message part
The STRUCTURED-DATA message part may contain meta-information about the syslog message, or application-specific information such as traffic counters or IP addresses. STRUCTURED-DATA consists of data blocks enclosed in brackets ([]). Every block includes the ID of the block, and one or more name=value pairs. The syslog-ng application automatically parses the STRUCTURED-DATA part of syslog messages, and the parsed values can be referenced in macros (for details, see Macros of syslog-ng PE). An example STRUCTURED-DATA block looks like:
[exampleSDID@0 iut="3" eventSource="Application" eventID="1011"][examplePriority@0 class="high"]
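The block structure can be made concrete with a small Python sketch that turns the STRUCTURED-DATA part into nested dictionaries. This is an illustrative parser, not the one built into syslog-ng: the function name and regular expressions are assumptions, and the unescaping step assumes the RFC5424 rule that the characters ", \ and ] are escaped with a backslash inside parameter values.

```python
import re

NAME = r'[^\s=\]"]+'
VALUE = r'(?:[^"\\\]]|\\.)*'
# One data block: "[" ID, then zero or more ' name="value"' pairs, then "]".
ELEMENT_RE = re.compile(r'\[(?P<id>%s)(?P<params>(?: %s="%s")*)\]' % (NAME, NAME, VALUE))
PARAM_RE = re.compile(r' (?P<name>%s)="(?P<value>%s)"' % (NAME, VALUE))


def parse_structured_data(text: str) -> dict:
    """Map each block ID to a dictionary of its name=value pairs."""
    if text == "-":  # "-" means that no STRUCTURED-DATA is present
        return {}
    blocks = {}
    pos = 0
    while pos < len(text):
        element = ELEMENT_RE.match(text, pos)
        if element is None:
            raise ValueError("malformed STRUCTURED-DATA at offset %d" % pos)
        params = {}
        for param in PARAM_RE.finditer(element.group("params")):
            # Remove the backslash in front of escaped ", \ and ] characters.
            params[param.group("name")] = re.sub(r'\\(["\\\]])', r"\1", param.group("value"))
        blocks[element.group("id")] = params
        pos = element.end()
    return blocks


sdata = parse_structured_data(
    '[exampleSDID@0 iut="3" eventSource="Application" eventID="1011"]'
    '[examplePriority@0 class="high"]'
)
print(sdata["exampleSDID@0"]["eventSource"])  # Application
```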
The MSG message part
The MSG part contains the text of the message itself. The encoding of the text must be UTF-8 if the BOM character is present in the message. If the message does not contain the BOM character, the encoding is treated as unknown. Usually messages arriving from legacy sources do not include the BOM character. CRLF characters will not be removed from the message.
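The BOM rule can be sketched as follows in Python. The function name decode_msg is hypothetical, and decoding BOM-less input as UTF-8 with replacement characters is only a fallback chosen for this sketch; the text above merely says that the encoding is treated as unknown in that case.

```python
import codecs
from typing import Optional, Tuple


def decode_msg(raw: bytes) -> Tuple[str, Optional[str]]:
    """Return (text, encoding); encoding is None when no BOM is present."""
    if raw.startswith(codecs.BOM_UTF8):
        # The BOM marks the text as UTF-8; strip it before decoding.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8"), "utf-8"
    # No BOM: the encoding is unknown. The fallback below is an assumption.
    return raw.decode("utf-8", errors="replace"), None


text, encoding = decode_msg(codecs.BOM_UTF8 + b"'su root' failed for lonvick")
print(encoding, text)
```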
Enterprise-wide message model (EWMM)
The following section describes the structure of log messages using the Enterprise-wide message model or EWMM message format.
The Enterprise-wide message model or EWMM allows you to deliver structured messages from the initial receiving syslog-ng component right up to the central log server, through any number of hops. It does not matter if you parse the messages on the client, on a relay, or on the central server: their structured results will be available where you store the messages. Optionally, you can also forward the original raw message exactly as the first syslog-ng component in your infrastructure received it, which is important if you want to forward a message, for example, to a SIEM system. To make use of the enterprise-wide message model, you have to use the syslog-ng() destination on the sender side, and the default-network-drivers() source on the receiver side.
The following is a sample log message in EWMM format.
<13>1 2018-05-13T13:27:50.993+00:00 my-host @syslog-ng - - -
{"MESSAGE":"<34>Oct 11 22:14:15 mymachine su: 'su root' failed for username on /dev/pts/8","HOST_FROM":"my-host","HOST":"my-host","FILE_NAME":"/tmp/in","._TAGS":".source.s_file"}
The message has the following parts:
- The header of the message complies with the RFC5424 message format, where the PROGRAM field is set to @syslog-ng, and the SDATA field is empty.
- The MESSAGE part is in JSON format, and contains the actual message, as well as any name-value pairs that syslog-ng PE has attached to or extracted from the message. The ${._TAGS} field contains the identifier of the syslog-ng source that has originally received the message on the first syslog-ng node.
To send a message in EWMM format, you can use the syslog-ng() destination driver, or the format-ewmm() template function.
To receive a message in EWMM format, you can use the default-network-drivers() source driver, or the ewmm-parser() parser.
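To show how the two parts of an EWMM message relate, the following Python sketch separates the RFC5424-style header from the JSON payload of the sample message and loads the name-value pairs. It is a simplified illustration, not a reimplementation of ewmm-parser(): the parse_ewmm name is hypothetical, and the code assumes that the seven header fields never contain spaces.

```python
import json


def parse_ewmm(line: str) -> dict:
    """Return the name-value pairs carried in the JSON part of an EWMM message."""
    # Seven header fields (PRI+VERSION, timestamp, host, program, PROCID,
    # MSGID, SDATA), followed by the JSON payload.
    parts = line.split(" ", 7)
    if len(parts) != 8 or parts[3] != "@syslog-ng":
        raise ValueError("not an EWMM message")
    return json.loads(parts[7])


sample = (
    '<13>1 2018-05-13T13:27:50.993+00:00 my-host @syslog-ng - - - '
    '{"MESSAGE":"<34>Oct 11 22:14:15 mymachine su: \'su root\' failed for username on /dev/pts/8",'
    '"HOST_FROM":"my-host","HOST":"my-host","FILE_NAME":"/tmp/in","._TAGS":".source.s_file"}'
)
pairs = parse_ewmm(sample)
print(pairs["HOST"], pairs["._TAGS"])  # my-host .source.s_file
```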
Message representation in syslog-ng PE
When the syslog-ng PE application receives a message, it automatically parses the message. The syslog-ng PE application can automatically parse log messages that conform to the RFC3164 (BSD or legacy-syslog) or the RFC5424 (IETF-syslog) message formats. If syslog-ng PE cannot parse a message, it results in an error.
TIP: If you need to relay messages that cannot be parsed, and you want to forward them without any modification, use the flags(no-parse) option in the source definition, and a template containing only the ${MSG} macro in the destination definition.
To parse non-syslog messages, for example, JSON, CSV, or other messages, you can use the built-in parsers of syslog-ng PE. For details, see parser: Parse and segment structured messages.
A parsed syslog message has the following parts:
- Timestamps: Two timestamps are associated with every message: one is the timestamp contained within the message (that is, when the sender sent the message), the other is the time when syslog-ng PE has actually received the message.
- Severity: The severity of the message.
- Facility: The facility that sent the message.
- Tags: Custom text labels added to the message that are mainly used for filtering. None of the current message transport protocols adds tags to the log messages. Tags can be added to the log message only within syslog-ng PE. The syslog-ng PE application automatically adds the ID of the source as a tag to the incoming messages. Other tags can be added to the message by the pattern database, or using the tags() option of the source.
- IP address of the sender: The IP address of the host that sent the message. Note that the IP address of the sender is a hard macro and cannot be modified within syslog-ng PE, but the associated hostname can be modified, for example, using rewrite rules.
- Hard macros: Hard macros contain data that is directly derived from the log message, for example, the ${MONTH} macro derives its value from the timestamp. The most important consideration with hard macros is that they are read-only, meaning they cannot be modified using rewrite rules or other means.
- Soft macros: Soft macros (sometimes also called name-value pairs) are either built-in macros automatically generated from the log message (for example, ${HOST}), or custom user-created macros generated by using the syslog-ng pattern database or a CSV-parser. The SDATA fields of RFC5424-formatted log messages become soft macros as well. In contrast with hard macros, soft macros are writable and can be modified within syslog-ng PE, for example, using rewrite rules.
NOTE: It is also possible to set the value of built-in soft macros using parsers, for example, to set the ${HOST} macro from the message using a column of a CSV-parser.
The data extracted from the log messages using named pattern parsers in the pattern database are also soft macros.
Message size and encoding
Internally, syslog-ng PE represents every message as UTF-8. The maximum length of the log messages is limited by the log-msg-size() option: if a message is longer than this value, syslog-ng PE truncates the message at the point where it reaches the log-msg-size() value, and discards the rest of the message.
When encoding is set in a source (using the encoding() option) and the message is longer (in bytes) than log-msg-size() in UTF-8 representation, syslog-ng PE splits the message at an undefined location (because the conversion between different encodings is not trivial).
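The difference between a length in characters and a length in bytes is easy to see in a short Python sketch. It does not reproduce how syslog-ng PE truncates or splits messages: the truncate_utf8 helper is hypothetical, and dropping an incomplete trailing character is a design choice of this example, made only to show why a byte limit may not fall on a character boundary.

```python
def truncate_utf8(message: str, max_bytes: int) -> str:
    """Cut a message so that its UTF-8 representation fits into max_bytes."""
    data = message.encode("utf-8")
    if len(data) <= max_bytes:
        return message
    # A byte limit can fall inside a multi-byte character; errors="ignore"
    # drops the incomplete trailing sequence instead of raising an error.
    return data[:max_bytes].decode("utf-8", errors="ignore")


word = "naïve"                      # 5 characters, 6 bytes in UTF-8
print(len(word), len(word.encode("utf-8")))
print(truncate_utf8(word, 3))       # the 2-byte "ï" does not fit
```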
Structuring macros, metadata, and other value-pairs
Available in syslog-ng PE 3.35.1 and later.
The syslog-ng PE application allows you to select and construct name-value pairs from any information already available about the log message, or extracted from the message itself. You can directly use this structured information, for example, in the following places:
When using value-pairs, there are three ways to specify which information (that is, macros or other name-value pairs) to include in the selection.
- Select groups of macros using the scope() parameter, and optionally remove certain macros from the group using the exclude() parameter.
- List specific macros to include using the key() parameter.
- Define new name-value pairs to include using the pair() parameter.
These parameters are detailed in value-pairs().
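The three selection methods can be pictured with a toy Python model. Everything in it is an assumption made for illustration and none of it is syslog-ng code: the select_value_pairs function, the "demo" group, the order in which the steps are applied, and the use of glob patterns for exclusion are all hypothetical. Refer to value-pairs() for the actual behavior.

```python
from fnmatch import fnmatchcase


def select_value_pairs(message, scopes, scope=(), exclude=(), key=(), pair=None):
    """Toy model: build a selection of name-value pairs from a message."""
    selected = {}
    # scope(): include every name that belongs to the chosen groups.
    for group in scope:
        for name in scopes[group]:
            if name in message:
                selected[name] = message[name]
    # exclude(): remove names that match a pattern (glob matching is assumed).
    for pattern in exclude:
        selected = {n: v for n, v in selected.items() if not fnmatchcase(n, pattern)}
    # key(): include specific names explicitly.
    for name in key:
        if name in message:
            selected[name] = message[name]
    # pair(): add newly defined name-value pairs.
    selected.update(pair or {})
    return selected


message = {"HOST": "my-host", "PROGRAM": "su", "R_DATE": "2018-05-13", "PID": "42"}
scopes = {"demo": ["HOST", "PROGRAM", "R_DATE"]}  # hypothetical group of macros
result = select_value_pairs(message, scopes, scope=["demo"], exclude=["R_*"],
                            key=["PID"], pair={"site": "lab"})
print(result)
```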