
syslog-ng Premium Edition 6.0.17 - Administration Guide

Collecting messages from remote hosts using the BSD syslog protocol

NOTE:

The tcp(), tcp6(), udp(), and udp6() drivers are obsolete. Use the network() source and the network() destination instead. For details, see the section called “Collecting messages using the RFC3164 protocol (network() driver)” and the section called “Sending messages to a remote log server using the RFC3164 protocol (network() driver)”, respectively.

The tcp(), tcp6(), udp(), and udp6() drivers can receive syslog messages conforming to RFC3164 from the network using the TCP and UDP networking protocols. The tcp6() and udp6() drivers use the IPv6 network protocol, while tcp() and udp() use IPv4.

To convert your existing tcp(), tcp6(), udp(), udp6() source drivers to use the network() driver, see Procedure 6.1, “Change an old source driver to the network() driver”.

tcp(), tcp6(), udp() and udp6() source options — OBSOLETE

Procedure 6.1. Change an old source driver to the network() driver

To replace your existing tcp(), tcp6(), udp(), udp6() sources with a network() source, complete the following steps.

  1. Replace the driver with network. For example, replace udp( with network(

  2. Set the transport protocol.

    • If you used TLS-encryption, add the transport("tls") option, then continue with the next step.

    • If you used the tcp or tcp6 driver, add the transport("tcp") option.

    • If you used the udp or udp6 driver, add the transport("udp") option.

  3. If you use IPv6 (that is, the udp6 or tcp6 driver), add the ip-protocol("6") option.

  4. If you did not specify the port used in the old driver, check the section called “network() source options” and verify that your clients send the messages to the default port of the transport protocol you use. Otherwise, set the appropriate port number in your source using the port() option.

  5. All other options are identical. Test your configuration with the syslog-ng --syntax-only command.

    The following configuration shows a simple tcp source.

    source s_old_tcp {
        tcp(
            ip(127.0.0.1) port(1999)
            tls(
                peer-verify("required-trusted")
                key-file("/opt/syslog-ng/etc/syslog-ng/syslog-ng.key")
                cert-file('/opt/syslog-ng/etc/syslog-ng/syslog-ng.crt')
            )
        );
    };

    When replaced with the network() driver, it looks like this.

    source s_new_network_tcp {
        network(
            transport("tls")
            ip(127.0.0.1) port(1999)
            tls(
                peer-verify("required-trusted")
                key-file("/opt/syslog-ng/etc/syslog-ng/syslog-ng.key")
                cert-file('/opt/syslog-ng/etc/syslog-ng/syslog-ng.crt')
            )
        );
    };
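
As a further illustration of steps 1 to 4, the following sketch converts a plain IPv6 UDP source. The source names and port 514 are chosen only for this example, and ip-protocol() is written here as a bare number.

```
# Old, obsolete form
source s_old_udp6 {
    udp6(port(514));
};

# Equivalent network() source: UDP transport over IPv6.
# Keep only one of the two sources in a real configuration,
# because both would listen on the same port.
source s_new_network_udp6 {
    network(
        transport("udp")
        ip-protocol(6)
        port(514)
    );
};
```

As in step 5 of the procedure, check the result with the syslog-ng --syntax-only command before reloading.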

Collecting systemd messages using a socket

On platforms running systemd, the systemd-syslog() driver reads the log messages of systemd using the /run/systemd/journal/syslog socket. Note the following points about this driver:

  • If possible, use the more reliable systemd-journal() driver instead.

  • The socket activation of systemd is buggy, causing some log messages to get lost during system startup.

  • If syslog-ng PE is running in a jail or a Linux Container (LXC), it will not read from the /dev/kmsg or /proc/kmsg files.

Declaration: 

systemd-syslog();

Example 6.40. Using the systemd-syslog() driver

@version: 6.0

source s_systemdd {
	systemd-syslog();
};

destination d_network {
	syslog("server.host");
};

log {
	source(s_systemdd);
	destination(d_network);
};

Collecting messages from UNIX domain sockets

The unix-stream() and unix-dgram() drivers open an AF_UNIX socket and start listening on it for messages. The unix-stream() driver is primarily used on Linux and uses SOCK_STREAM semantics (connection oriented, no messages are lost), while unix-dgram() is used on BSDs and uses SOCK_DGRAM semantics: this may result in lost local messages if the system is overloaded.

To avoid denial of service attacks when using connection-oriented protocols, limit the number of simultaneously accepted connections using the max-connections() parameter. The default value of this parameter is quite strict, so you might have to increase it on a busy system.

Both unix-stream and unix-dgram have a single required argument that specifies the filename of the socket to create. For the list of available optional parameters, see the section called “unix-stream() and unix-dgram() source options”.

Caution:

This feature is currently not available when running the syslog-ng PE application on Microsoft Windows platforms. For a complete list of limitations, see the section called “Limitations on Microsoft Windows platforms”.

Declaration: 

unix-stream(filename [options]);
unix-dgram(filename [options]);

NOTE:

syslogd on Linux originally used SOCK_STREAM sockets, but some distributions switched to SOCK_DGRAM around 1999 to fix a possible DoS problem. On Linux you can choose to use whichever driver you like as syslog clients automatically detect the socket type being used.

Example 6.41. Using the unix-stream() and unix-dgram() drivers

source s_stream { unix-stream("/dev/log" max-connections(10)); };
source s_dgram { unix-dgram("/var/run/log"); };

unix-stream() and unix-dgram() source options

Caution:

This feature is currently not available when running the syslog-ng PE application on Microsoft Windows platforms. For a complete list of limitations, see the section called “Limitations on Microsoft Windows platforms”.

These two drivers behave similarly: they open an AF_UNIX socket and start listening on it for messages. The following options can be specified for these drivers:

create-dirs()
Type: yes or no
Default: no

Description: Enable creating non-existing directories when creating the socket files.

encoding()
Type: string
Default:

Description: Specifies the character set (encoding, for example UTF-8) of messages using the legacy BSD-syslog protocol. To list the available character sets on a host, execute the iconv -l command. For details on how encoding affects the size of the message, see the section called “Message size and encoding”.

flags()
Type: assume-utf8, empty-lines, expect-hostname, kernel, no-hostname, no-multi-line, no-parse, dont-store-legacy-msghdr, syslog-protocol, validate-utf8
Default: empty set

Description: Specifies the log parsing options of the source.

  • assume-utf8: The assume-utf8 flag assumes that the incoming messages are UTF-8 encoded, but does not verify the encoding. If you explicitly want to validate the UTF-8 encoding of the incoming message, use the validate-utf8 flag.

  • dont-store-legacy-msghdr: By default, syslog-ng stores the original incoming header of the log message. This is useful if the original format of a non-syslog-compliant message must be retained (syslog-ng automatically corrects minor header errors, for example, adds a whitespace before msg in the following message: Jan 22 10:06:11 host program:msg). If you do not want to store the original header of the message, enable the dont-store-legacy-msghdr flag.

  • empty-lines: Use the empty-lines flag to keep the empty lines of the messages. By default, syslog-ng PE removes empty lines automatically.

  • expect-hostname: If the expect-hostname flag is enabled, syslog-ng PE will assume that the log message contains a hostname and parse the message accordingly. This is the default behavior for TCP sources. Note that pipe sources use the no-hostname flag by default.

  • kernel: The kernel flag makes the source default to the LOG_KERN | LOG_NOTICE priority if not specified otherwise.

  • no-hostname: Enable the no-hostname flag if the log message does not include the hostname of the sender host. That way syslog-ng PE assumes that the first part of the message header is ${PROGRAM} instead of ${HOST}. For example:

    source s_dell { network(port(2000) flags(no-hostname)); };

  • no-multi-line: The no-multi-line flag disables line-breaking in the messages: the entire message is converted to a single line. Note that this happens only if the underlying transport method actually supports multi-line messages. Currently, the rltp, syslog(), network(), and unix-dgram() drivers support multi-line messages.

  • no-parse: By default, syslog-ng PE parses incoming messages as syslog messages. The no-parse flag completely disables syslog message parsing and processes the complete line as the message part of a syslog message. The syslog-ng PE application will generate a new syslog header (timestamp, host, and so on) automatically and put the entire incoming message into the ${MESSAGE} part of the syslog message. This flag is useful for processing messages that do not comply with the syslog format.

    Note that since flags(no-parse) disables message parsing, it interferes with other flags, for example, it disables flags(no-multi-line).

  • syslog-protocol: The syslog-protocol flag specifies that incoming messages are expected to be formatted according to the new IETF syslog protocol standard (RFC5424), but without the frame header. Note that this flag is not needed for the syslog driver, which handles only messages that have a frame header.

  • validate-utf8: The validate-utf8 flag enables encoding-verification for messages formatted according to the new IETF syslog standard (for details, see the section called “IETF-syslog messages”). If the BOM[10] character is missing, but the message is otherwise UTF-8 compliant, syslog-ng automatically adds the BOM character to the message.

group()
Type: string
Default: root

Description: Set the gid of the socket.

host-override()
Type: string
Default:

Description: Replaces the ${HOST} part of the message with the parameter string.

keep-alive()
Type: yes or no
Default: yes

Description: Selects whether to keep connections open when syslog-ng is restarted. Cannot be used with unix-dgram().

keep-timestamp()
Type: yes or no
Default: yes

Description: Specifies whether syslog-ng should accept the timestamp received from the sending application or client. If disabled, the time of reception will be used instead. This option can be specified globally, and per-source as well. The local setting of the source overrides the global option if available.

Caution:

To use the S_ macros, the keep-timestamp() option must be enabled (this is the default behavior of syslog-ng PE).
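
The interaction between the global and the per-source setting can be sketched as follows. The source name is hypothetical, and /dev/log is used only because it appears in the earlier examples.

```
# Global default: accept the timestamp sent by the client
options {
    keep-timestamp(yes);
};

# This source overrides the global setting: the time of reception
# is used instead of the timestamp in the message
source s_reception_time {
    unix-stream("/dev/log" keep-timestamp(no));
};
```

Because keep-timestamp() is disabled for this source, the S_ macros should not be relied on for messages arriving through it.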

log-fetch-limit()
Type: number (messages)
Default: 10

Description: The maximum number of messages fetched from a source during a single poll loop. The destination queues might fill up before flow-control could stop reading if log-fetch-limit() is too high.

log-iw-size()
Type: number (messages)
Default: 1000

Description: The size of the initial window. This value is used during flow control. If the max-connections() option is set, the log-iw-size() will be divided by the number of connections, otherwise log-iw-size() is divided by 10 (the default value of the max-connections() option). The resulting number is the initial window size of each connection. For optimal performance when receiving messages from syslog-ng PE clients, make sure that the window size is larger than the flush-lines() option set in the destination of your clients.

Example 6.42. Initial window size of a connection

If log-iw-size(1000) and max-connections(10), then each connection will have an initial window size of 100.
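
The same division applies to any pair of values. As a sketch with values chosen only for illustration, the following source allows 20 connections and keeps the per-connection window at 100 messages:

```
# 2000 / 20 = initial window of 100 messages per connection
source s_stream {
    unix-stream("/dev/log" max-connections(20) log-iw-size(2000));
};
```

With this setting, clients whose destinations use a flush-lines() value below 100 stay within the window of their connection.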


log-msg-size()
Type: number (bytes)
Default: Use the global log-msg-size() option, which defaults to 65535.

Description: Specifies the maximum length of incoming log messages. Uses the value of the global option if not specified. For details on how encoding affects the size of the message, see the section called “Message size and encoding”.

log-prefix() (DEPRECATED)
Type: string
Default:

Description: A string added to the beginning of every log message. It can be used to add an arbitrary string to any log source, though it is most commonly used for adding kernel: to the kernel messages on Linux. NOTE: This option is deprecated. Use program-override() instead.

max-connections()
Type: number (simultaneous connections)
Default: 256

Description: Limits the number of simultaneously open connections. Cannot be used with unix-dgram().

multi-line-garbage()
Type: regular expression
Default: empty string

Description: Use the multi-line-garbage() option when processing multi-line messages that contain unneeded parts between the messages. Specify a string or regular expression that matches the beginning of the unneeded message parts. If the multi-line-garbage() option is set, syslog-ng PE ignores the lines between the line matching the multi-line-garbage() and the next line matching multi-line-prefix(). See also the multi-line-prefix() option.

When receiving multi-line messages from a source when the multi-line-garbage() option is set, but no matching line is received between two lines that match multi-line-prefix(), syslog-ng PE will continue to process the incoming lines as a single message until a line matching multi-line-garbage() is received.

Caution:

If the multi-line-garbage() option is set, syslog-ng PE discards lines between the line matching the multi-line-garbage() and the next line matching multi-line-prefix().

NOTE:

Starting with syslog-ng PE version 3.2.1, a message is considered complete if no new lines arrive to the message for 10 seconds, even if no line matching the multi-line-garbage() option is received.

This option is not available for the unix-dgram driver.
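
As a rough sketch of how the two multi-line options fit together, assume a hypothetical application that writes records starting with a line beginning with BEGIN and ending with a line beginning with END, with unrelated output between records. The socket path, the source name, and both patterns are assumptions made for this example.

```
source s_app_records {
    unix-stream("/var/run/app-records.sock"
        # A new message starts at every line matching this pattern
        multi-line-prefix("^BEGIN")
        # Lines between a line matching this pattern and the next
        # multi-line-prefix() match are discarded
        multi-line-garbage("^END")
        flags(no-parse)
    );
};
```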

multi-line-prefix()
Type: regular expression
Default: empty string

Description: Use the multi-line-prefix() option to process multi-line messages, that is, log messages that contain newline characters (for example, Tomcat logs). Specify a string or regular expression that matches the beginning of the log messages. Keep the regular expression as simple as possible, because complex regular expressions can severely reduce the rate of processing multi-line messages. If the multi-line-prefix() option is set, syslog-ng PE ignores newline characters from the source until a line matches the regular expression again, and treats the lines between the matching lines as a single message. See also the multi-line-garbage() option.

NOTE:

Starting with syslog-ng PE version 3.2.1, a message is considered complete if no new lines arrive to the message for 10 seconds, even if no line matching the multi-line-garbage() option is received.

TIP:
  • To make multi-line messages more readable when written to a file, use a template in the destination and instead of the ${MESSAGE} macro, use the following: $(indent-multi-line ${MESSAGE}). This expression inserts a tab after every newline character (except when a tab is already present), indenting every line of the message after the first. For example:

    destination d_file {
        file ("/var/log/messages"
            template("${ISODATE} ${HOST} $(indent-multi-line ${MESSAGE})\n") );
    };

    For details on using templates, see the section called “Templates and macros”.

  • To actually convert the lines of multi-line messages to single line (by replacing the newline characters with whitespaces), use the flags(no-multi-line) option in the source.

Example 6.43. Processing Tomcat logs

The log messages of the Apache Tomcat server are a typical example of multi-line log messages. The messages start with the date and time of the query in the YYYY.MM.DD. HH:MM:SS format, as you can see in the following example.

2010.06.09. 12:07:39 org.apache.catalina.startup.Catalina start
SEVERE: Catalina.start:
LifecycleException:  service.getName(): "Catalina";  Protocol handler start failed: java.net.BindException: Address already in use<null>:8080
       at org.apache.catalina.connector.Connector.start(Connector.java:1138)
       at org.apache.catalina.core.StandardService.start(StandardService.java:531)
       at org.apache.catalina.core.StandardServer.start(StandardServer.java:710)
       at org.apache.catalina.startup.Catalina.start(Catalina.java:583)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
       at java.lang.reflect.Method.invoke(Method.java:597)
       at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:288)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
       at java.lang.reflect.Method.invoke(Method.java:597)
       at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:177)
2010.06.09. 12:07:39 org.apache.catalina.startup.Catalina start
INFO: Server startup in 1206 ms
2010.06.09. 12:45:08 org.apache.coyote.http11.Http11Protocol pause
INFO: Pausing Coyote HTTP/1.1 on http-8080
2010.06.09. 12:45:09 org.apache.catalina.core.StandardService stop
INFO: Stopping service Catalina

To process these messages, specify a regular expression matching the timestamp of the messages in the multi-line-prefix() option. Such an expression is the following:

source s_file {
    file("/var/log/tomcat6/catalina.2010-06-09.log"
        follow-freq(0)
        multi-line-prefix("[0-9]{4}\.[0-9]{2}\.[0-9]{2}\.")
        flags(no-parse)
    );
};

Note that the flags(no-parse) is needed to avoid syslog-ng PE trying to interpret the date in the message.


This option is not available for the unix-dgram driver.

optional()
Type: yes or no
Default:

Description: Instruct syslog-ng to ignore the error if a specific source cannot be initialized. No other attempts to initialize the source will be made until the configuration is reloaded. This option currently applies to the pipe(), unix-dgram, and unix-stream drivers.

owner()
Type: string
Default: root

Description: Set the uid of the socket.

pad-size()
Type: number (bytes)
Default: 0

Description: Specifies input padding. Some operating systems (such as HP-UX) pad all messages to a block boundary. This option can be used to specify the block size (HP-UX uses 2048 bytes). The syslog-ng PE application will pad reads from the associated device to the number of bytes set in pad-size(). Mostly used on HP-UX, where /dev/log is a named pipe and every write is padded to 2048 bytes. If pad-size() was given and the incoming message does not fit into pad-size(), syslog-ng will not read from this pipe any more and displays the following error message:

Padding was set, and couldn't read enough bytes

perm()
Type: number (octal notation)
Default: 0666

Description: Set the permission mask. For octal numbers prefix the number with '0', for example: use 0755 for rwxr-xr-x.
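
The ownership and permission options are typically set together. The following is a minimal sketch, assuming a hypothetical application socket under /var/run/myapp and an adm group that exists on the host:

```
source s_app_socket {
    unix-stream("/var/run/myapp/log"
        create-dirs(yes)     # create /var/run/myapp if it does not exist
        owner("root")
        group("adm")
        perm(0660)           # rw-rw----
    );
};
```

Compared with the default of 0666, this mask lets only the owner and members of the group write to the socket.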

program-override()
Type: string
Default:

Description: Replaces the ${PROGRAM} part of the message with the parameter string. For example, to mark every message coming from the kernel, include the program-override("kernel") option in the source containing /proc/kmsg.
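
The kernel example mentioned above might look like the following sketch. The source name is arbitrary, and the flags(kernel) option is added here only to illustrate how it combines with program-override().

```
source s_kernel {
    file("/proc/kmsg" program-override("kernel") flags(kernel));
};
```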

so-keepalive()
Type: yes or no
Default: no

Description: Enables keep-alive messages, keeping the socket open. This only affects TCP and UNIX-stream sockets. For details, see the socket(7) manual page.

so-rcvbuf()
Type: number (bytes)
Default: 0

Description: Specifies the size of the socket receive buffer in bytes. For details, see the socket(7) manual page.

Caution:

When receiving messages using the UDP protocol, increase the size of the UDP receive buffer on the receiver host (that is, the syslog-ng PE server or relay receiving the messages). Note that on certain platforms, for example, on Red Hat Enterprise Linux 5, even low message load (~200 messages per second) can result in message loss, unless the so-rcvbuf() option of the source is increased. In such cases, you will need to increase the net.core.rmem_max parameter of the host (for example, to 1024000), but do not modify the net.core.rmem_default parameter.

As a general rule, increase the so-rcvbuf() so that the buffer size in kilobytes is higher than the rate of incoming messages per second. For example, to receive 2000 messages per second, set the so-rcvbuf() at least to 2 097 152 bytes.
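
Applied to a UDP source, the rule of thumb above could be written as in the following sketch. The source name is hypothetical, and the message rate is the one from the example.

```
# 2000 messages per second -> buffer larger than 2000 KB.
# 2048 KB = 2048 * 1024 = 2097152 bytes.
source s_udp {
    network(
        transport("udp")
        so-rcvbuf(2097152)
    );
};
```

The operating system must allow a buffer of this size, so the net.core.rmem_max parameter of the host has to be at least as large as the so-rcvbuf() value.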

tags()
Type: string
Default:

Description: Label the messages received from the source with custom tags. Tags must be unique, and enclosed between double quotes. When adding multiple tags, separate them with commas, for example, tags("dmz", "router"). This option is available only in syslog-ng 3.1 and later.
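
A short sketch of tagging a source, and of selecting the tagged messages later with the tags() filter function, follows. The source and filter names are hypothetical.

```
# Every message read from this source carries both tags
source s_dmz_socket {
    unix-stream("/dev/log" tags("dmz", "router"));
};

# Matches messages that carry the "dmz" tag
filter f_dmz {
    tags("dmz");
};
```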

time-zone()
Type: name of the timezone, or the timezone offset
Default:

Description: The default timezone for messages read from the source. Applies only if no timezone is specified within the message itself.

The timezone can be specified by using the name of the timezone (for example, time-zone("Europe/Budapest")), or as the timezone offset in +/-HH:MM format (for example, +01:00). On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.
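
Both notations are shown in the following sketch. The source names are hypothetical, and the socket paths are the ones used in the earlier examples.

```
# Timezone given by name
source s_by_name {
    unix-stream("/dev/log" time-zone("Europe/Budapest"));
};

# Timezone given as an offset
source s_by_offset {
    unix-dgram("/var/run/log" time-zone("+01:00"));
};
```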

use-syslogng-pid()
Type: yes or no
Default: no

Description: If the value of this option is yes, then the PID value of the message will be overridden with the PID of the running syslog-ng process.



[10] The byte order mark (BOM) is a Unicode character used to signal the byte-order of the message text.

Chapter 7. Sending and storing log messages — destinations and destination drivers

A destination is where a log message is sent if the filtering rules match. Similarly to sources, destinations consist of one or more drivers, each defining where and how messages are sent.

TIP:

If no drivers are defined for a destination, all messages sent to the destination are discarded. This is equivalent to omitting the destination from the log statement.

To define a destination, add a destination statement to the syslog-ng configuration file using the following syntax:

destination <identifier> {
                destination-driver(params); destination-driver(params); ... };

Example 7.1. A simple destination statement

The following destination statement sends messages to the TCP port 1999 of the 10.1.2.3 host.

destination d_demo_tcp { network("10.1.2.3" port(1999)); };

If name resolution is configured, you can use the hostname of the target server as well.

destination d_tcp { network("target_host" port(1999)); };

Caution:
  • Do not define the same drivers with the same parameters more than once, because it will cause problems. For example, do not open the same file in multiple destinations.

  • Do not use the same destination in different log paths, because it can cause problems with most destination types. Instead, use filters and log paths to avoid such situations.

  • Sources and destinations are initialized only when they are used in a log statement. For example, syslog-ng PE starts listening on a port or starts polling a file only if the source is used in a log statement. For details on creating log statements, see Chapter 8, Routing messages: log paths, reliability, and filters.

  • Hazard of data loss! If your log files are on an NFS-mounted network file system, see the section called “NFS file system for log files”.

The following table lists the destination drivers available in syslog-ng PE.

Table 7.1. Destination drivers available in syslog-ng

Name                              Description
elasticsearch and elasticsearch2  Sends messages to an Elasticsearch server. The elasticsearch2 driver supports Elasticsearch version 2.0-2.4.6.
file()                            Writes messages to the specified file.
hdfs                              Sends messages into a file on a Hadoop Distributed File System (HDFS) node.
kafka                             Publishes log messages to the Apache Kafka message bus, where subscribers can access them.
logstore()                        Writes messages to the specified binary logstore file.
mongodb()                         Sends messages to a MongoDB database.
network()                         Sends messages to a remote host using the BSD-syslog protocol over IPv4 and IPv6. Supports the TCP, UDP, RLTP™, and TLS network protocols.
pipe()                            Writes messages to the specified named pipe.
program()                         Forks and launches the specified program, and sends messages to its standard input.
smtp()                            Sends e-mail messages to trusted recipients, through a controlled channel using the SMTP protocol.
snmp()                            Sends messages to the specified remote host using the SNMP v2c or v3 protocol.
sql()                             Sends messages into an SQL database. In addition to the standard syslog-ng packages, the sql() destination requires database-specific packages to be installed. Refer to the section appropriate for your platform in Chapter 3, Installing syslog-ng.
syslog()                          Sends messages to the specified remote host using the IETF-syslog protocol. The IETF standard supports message transport using the UDP, TCP, and TLS networking protocols.
unix-dgram()                      Sends messages to the specified unix socket in SOCK_DGRAM style (BSD).
unix-stream()                     Sends messages to the specified unix socket in SOCK_STREAM style (Linux).
usertty()                         Sends messages to the terminal of the specified user, if the user is logged in.

Sending messages directly to Elasticsearch version 1.x

Starting with version 5.6, syslog-ng PE can directly send log messages to Elasticsearch, allowing you to search and analyze your data in real time, and visualize it with Kibana.

NOTE:

In order to use this destination, syslog-ng Premium Edition must run in server mode. Typically, only the central syslog-ng Premium Edition server uses this destination. For details on the server mode, see the section called “Server mode”.

Note the following limitations when using the syslog-ng PE elasticsearch destination:

  • This destination is only supported on the Linux platforms that use the linux glibc2.11 installer, including: Red Hat ES 7, Ubuntu 14.04 (Trusty Tahr).

  • Since syslog-ng PE uses the official Java Elasticsearch libraries, the elasticsearch destination has significant memory usage.

  • The log messages of the underlying client libraries are available in the internal() source of syslog-ng PE.

Declaration: 

@module mod-java
@include "scl.conf"

elasticsearch(
    index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
    type("test")
    cluster("syslog-ng")
);

Example 7.2. Sending log data to Elasticsearch version 1.x

The following example defines an elasticsearch destination that sends messages in transport mode to an Elasticsearch server version 1.x running on the localhost, using only the required parameters.

@module mod-java
@include "scl.conf"

destination d_elastic {
  elasticsearch(
    index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
    type("test")
  );
};

The following example sends 10000 messages in a batch, in node mode, and includes a custom unique ID for each message.

@module mod-java
@include "scl.conf"

options {
  threaded(yes);
  use_uniqid(yes);
};

source s_syslog {
  syslog();
};

destination d_elastic {
  elasticsearch(
    index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
    type("test")
    cluster("syslog-ng")
    client_mode("node")
    custom_id("${UNIQID}")
    flush-limit("10000")
  );
};

log {
  source(s_syslog);
  destination(d_elastic);
  flags(flow-control);
};

Procedure 7.1. Prerequisites

To send messages from syslog-ng PE to Elasticsearch, complete the following steps.

Steps: 

  1. If you want to use the Java-based modules of syslog-ng PE (for example, the Elasticsearch, HDFS, or Kafka destinations), you must compile syslog-ng PE with Java support.

    • Download and install the Java Runtime Environment (JRE), 1.7 (or newer). The Java-based modules of syslog-ng PE are tested and supported when using the Oracle implementation of Java. Other implementations are untested and unsupported, and may or may not work as expected.

    • Install gradle version 2.2.1 or newer.

    • Set LD_LIBRARY_PATH to include the libjvm.so file, for example: LD_LIBRARY_PATH=/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/amd64/server:$LD_LIBRARY_PATH

      Note that many platforms have simplified links for Java libraries. Use the simplified path if available. If you use a startup script to start syslog-ng PE, set LD_LIBRARY_PATH in the script as well.

    • If you are behind an HTTP proxy, create a gradle.properties under the modules/java-modules/ directory. Set the proxy parameters in the file. For details, see The Gradle User Guide.

  2. Download the Elasticsearch libraries version 1.5 or newer from the 1.x line from https://www.elastic.co/downloads/elasticsearch. One Identity tests the destination using Elasticsearch version 1.5. To use Elasticsearch 2.0-2.4.6, use the elasticsearch2() destination (see the section called “Sending messages directly to Elasticsearch version 2.0 or higher”).

  3. Extract the Elasticsearch libraries into a temporary directory, then collect the various .jar files into a single directory (for example, /opt/elasticsearch/lib/) where syslog-ng PE can access them. You must specify this directory in the syslog-ng PE configuration file. The files are located in the lib directory and its subdirectories of the Elasticsearch release package.

How syslog-ng PE interacts with Elasticsearch

The syslog-ng PE application sends the log messages to the official Elasticsearch client library, which forwards the data to the Elasticsearch nodes. How syslog-ng PE interacts with Elasticsearch is described in the following steps.

  • After syslog-ng PE is started and the first message arrives to the elasticsearch destination, the elasticsearch destination tries to connect to the Elasticsearch server or cluster. If the connection fails, syslog-ng PE will repeatedly attempt to connect again after the period set in time-reopen() expires.

  • If the connection is established, syslog-ng PE sends JSON-formatted messages to Elasticsearch.

    • If flush_limit is set to 1: syslog-ng PE sends the message reliably: it sends a message to Elasticsearch, then waits for a reply from Elasticsearch. In case of failure, syslog-ng PE repeats sending the message, as set in the retries() parameter. If sending the message fails for retries() times, syslog-ng PE drops the message.

      This method ensures reliable message transfer, but is slow.

    • If flush_limit is higher than 1: syslog-ng PE sends messages in a batch, and receives the response asynchronously. In case of a problem, syslog-ng PE cannot resend the messages.

      This method is relatively fast (depending on the size of flush_limit), but the transfer is not reliable. In transport mode, several messages can be lost before syslog-ng PE recognizes the error. Node mode is more reliable in this sense, because the message loss rate is significantly lower.

    • If concurrent-requests is higher than 1, syslog-ng PE can send multiple batches simultaneously, increasing performance (and also the number of messages that can be lost in case of an error). For details, see the section called “concurrent_requests()”.

Client modes

The syslog-ng PE application can interact with Elasticsearch in transport mode or node mode.

  • Transport mode. The syslog-ng PE application uses the transport client API of Elasticsearch, and uses the server(), port(), and cluster() options from the syslog-ng PE configuration file.

  • Node mode. The syslog-ng PE application acts as an Elasticsearch node (client no-data), using the node client API of Elasticsearch. Further options for the node can be described in an Elasticsearch configuration file specified in the resource() option.

    NOTE:

    In node mode, you must define the home directory of the Elasticsearch installation with the path.home parameter in the .yml file. For example: path.home: /usr/share/elasticsearch.

  • Shield mode. This mode is available only from syslog-ng PE version 5.6. In this mode, syslog-ng PE uses the transport client API of Elasticsearch, and uses the server(), port(), and cluster() options from the syslog-ng PE configuration file, but with Shield (X-Pack security) support. For more details about Shield, see: https://www.elastic.co/products/x-pack/security.
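
As a minimal sketch of transport mode, the following destination connects to a single Elasticsearch server using the server(), port(), and cluster() options. The destination name, the cluster name, and the index and type values are placeholders chosen for illustration, and the library path is the example directory from the prerequisites. Treat the fragment as an assumption-laden starting point rather than a tested configuration.

destination d_elastic_transport {
    elasticsearch(
        # Directory that holds the Elasticsearch .jar files (example path)
        client_lib_dir("/opt/elasticsearch/lib/")
        # Use the transport client API instead of the default node mode
        client_mode("transport")
        server("127.0.0.1")
        port(9300)
        # Placeholder cluster name
        cluster("my-elasticsearch-cluster")
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
    );
};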

Elasticsearch destination options

The elasticsearch destination can directly send log messages to Elasticsearch, allowing you to search and analyze your data in real time, and visualize it with Kibana. The elasticsearch destination has the following options.

Required options: 

The following options are required: index(), type(). In node mode, either the cluster() or the resource() option is required as well. Note that to use elasticsearch, you must add the following lines to the beginning of your syslog-ng PE configuration:

@module mod-java
@include "scl.conf"
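
The following fragment is a minimal sketch of how these lines and the required options fit together. The source and destination names (s_internal, d_elastic), the cluster name, and the index and type values are placeholders chosen for illustration. The fragment omits the version declaration and any other settings that a complete configuration file needs.

@module mod-java
@include "scl.conf"

# Hypothetical source, used only to make the log path complete
source s_internal {
    internal();
};

destination d_elastic {
    elasticsearch(
        # Directory that holds the Elasticsearch .jar files (example path)
        client_lib_dir("/opt/elasticsearch/lib/")
        # Required options
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        # Node mode is the default, so cluster() or resource() is needed as well
        cluster("my-elasticsearch-cluster")
    );
};

log {
    source(s_internal);
    destination(d_elastic);
};
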
client_lib_dir()
Type: string
Default: N/A

Description: Include the path to the directory where you copied the required libraries (see Procedure 7.1, “Prerequisites”), for example, client_lib_dir(/user/share/elasticsearch-2.2.0/lib).

client_mode()
Type: transport | node | shield
Default: node

Description: Specifies the client mode used to connect to the Elasticsearch server, for example, option("client-mode", "transport").

  • Transport mode. The syslog-ng PE application uses the transport client API of Elasticsearch, and uses the server(), port(), and cluster() options from the syslog-ng PE configuration file.

  • Node mode. The syslog-ng PE application acts as an Elasticsearch node (client no-data), using the node client API of Elasticsearch. Further options for the node can be described in an Elasticsearch configuration file specified in the resource() option.

    NOTE:

    In node mode, you must define the home directory of the Elasticsearch installation with the path.home parameter in the .yml file. For example: path.home: /usr/share/elasticsearch.

  • Shield mode. This mode is available only from syslog-ng PE version 5.6. In this mode, syslog-ng PE uses the transport client API of Elasticsearch, and uses the server(), port(), and cluster() options from the syslog-ng PE configuration file, but with Shield (X-Pack security) support. For more details about Shield, see: https://www.elastic.co/products/x-pack/security.

  • To use Shield mode, add the Shield .jar file (shield-x.x.x.jar) to the directory where your Elasticsearch .jar files are located. You can download the Shield distribution and extract the .jar file manually, or you can get it from the Elasticsearch Maven repository.

    Shield mode inherits the transport mode options, but the Shield-related options must be configured in the .yml file (see the resource() option of syslog-ng PE). For more details about the possible options, see: https://www.elastic.co/guide/en/shield/current/reference.html#ref-ssl-tls-settings.

    Example 7.3. Example for the .yml file

    shield.user: es_admin:********
    shield.transport.ssl: true
    shield.ssl.keystore.path: /usr/share/elasticsearch/node.jks
    shield.ssl.keystore.password: mypassword
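
    On the syslog-ng PE side, a Shield mode destination might look like the following sketch. It assumes that the settings of the example above are saved in a file named /etc/syslog-ng/shield.yml; this path, the destination name, the cluster name, and the index and type values are hypothetical.

    destination d_elastic_shield {
        elasticsearch(
            # Directory that holds the Elasticsearch and Shield .jar files (example path)
            client_lib_dir("/opt/elasticsearch/lib/")
            client_mode("shield")
            # Transport mode options are used in Shield mode as well
            server("127.0.0.1")
            port(9300)
            cluster("my-elasticsearch-cluster")
            # Hypothetical location of the .yml file with the Shield settings
            resource("/etc/syslog-ng/shield.yml")
            index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
            type("test")
        );
    };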
    

cluster()
Type: string
Default: N/A

Description: Specifies the name of the Elasticsearch cluster, for example, cluster("my-elasticsearch-cluster"). Optionally, you can specify the name of the cluster in the Elasticsearch resource file. For details, see the section called “resource()”.

concurrent_requests()
Type: number
Default: 0

Description: The number of concurrent (simultaneous) requests that syslog-ng PE sends to the Elasticsearch server. Set this option to 1 or higher to increase performance. When using the concurrent_requests() option, make sure that the flush-limit() option is higher than one, otherwise it will not have any noticeable effect. For details, see the section called “flush_limit()”.

Caution:

Hazard of data loss! Using the concurrent-requests() option increases the number of messages lost if the Elasticsearch server becomes inaccessible.

custom_id()
Type: template or template function
Default: N/A

Description: Use this option to specify a custom ID for the records inserted into Elasticsearch. If this option is not set, the Elasticsearch server automatically generates an ID for the message. For example: custom_id(${UNIQID}). Note that to use the ${UNIQID} macro, the use-uniqid() global option must be enabled. For details, see the section called “use-uniqid()”.
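
The two settings can be combined as in the following sketch. The destination name, the cluster name, and the index and type values are placeholders, and quoting the macro as a template string is an assumption about the accepted notation rather than a documented requirement.

# Enable the ${UNIQID} macro globally
options {
    use-uniqid(yes);
};

destination d_elastic_custom_id {
    elasticsearch(
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        cluster("my-elasticsearch-cluster")
        # Use the unique ID of the message as the Elasticsearch record ID
        custom_id("${UNIQID}")
    );
};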

disk-buffer()

Description: This option enables putting outgoing messages into the disk buffer of the destination to avoid message loss in case of a system failure on the destination side. It has the following options:

reliable()
Type: yes|no
Default: no

Description: If set to yes, syslog-ng PE cannot lose logs in case of reload/restart, unreachable destination or syslog-ng PE crash. This solution provides a slower, but reliable disk-buffer option. It is created and initialized at startup and gradually grows as new messages arrive. If set to no, the normal disk-buffer will be used. This provides a faster, but less reliable disk-buffer option.

Caution:

Hazard of data loss! If you change the value of reliable() option when there are messages in the disk-buffer, the messages stored in the disk-buffer will be lost.

dir()
Type: string
Default: N/A

Description: Defines the folder where the disk-buffer files are stored. This option has priority over --qdisk-dir=.

Caution:

When creating a new dir() option for a disk buffer, or modifying an existing one, make sure you delete the persist file, or at least remove the relevant persist-entry.

syslog-ng PE creates disk-buffer files based on the path recorded in the persist file. Therefore, if the persist file or the relevant entry is not deleted after modifying the dir() option, then following a restart, syslog-ng PE will look for or create disk-buffer files in their old location. To ensure that syslog-ng PE uses the new dir() setting, the persist file must not contain any information about the destinations which the disk-buffer file in question belongs to.

disk-buf-size()
Type: number (bytes)
Default:

Description: This is a required option. The maximum size of the disk-buffer in bytes. The minimum value is 1048576 bytes. If you set a smaller value, the minimum value will be used automatically. It replaces the old log-disk-fifo-size() option.

mem-buf-length()
Type: number (messages)
Default: 10000

Description: Use this option if the option reliable() is set to no. This option contains the number of messages stored in overflow queue. It replaces the old log-fifo-size() option. It inherits the value of the global log-fifo-size() option if provided. If it is not provided, the default value is 10000 messages. Note that this option will be ignored if the option reliable() is set to yes.

mem-buf-size()
Type: number (bytes)
Default: 163840000

Description: Use this option if the option reliable() is set to yes. This option contains the size of the messages in bytes that is used in the memory part of the disk buffer. It replaces the old log-fifo-size() option. It does not inherit the value of the global log-fifo-size() option, even if it is provided. Note that this option will be ignored if the option reliable() is set to no.

quot-size()
Type: number (messages)
Default: 64

Description: The number of messages stored in the output buffer of the destination.

Options reliable() and disk-buf-size() are required options.

Example 7.4. Examples for using disk-buffer()

In the following case reliable disk-buffer() is used.

destination d_demo {
    network(
        "127.0.0.1"
        port(3333)
        disk-buffer(
            # Memory part of the reliable disk-buffer, in bytes
            mem-buf-size(10000)
            # Maximum size of the disk-buffer, in bytes
            disk-buf-size(2000000)
            reliable(yes)
            dir("/tmp/disk-buffer")
        )
    );
};

In the following case normal disk-buffer() is used.

destination d_demo {
    network(
        "127.0.0.1"
        port(3333)
        disk-buffer(
            # Number of messages stored in the overflow queue
            mem-buf-length(10000)
            # Maximum size of the disk-buffer, in bytes
            disk-buf-size(2000000)
            reliable(no)
            dir("/tmp/disk-buffer")
        )
    );
};

flush_limit()
Type: number
Default: 5000

Description: The number of messages that syslog-ng PE sends to the Elasticsearch server in a single batch.

  • If flush_limit is set to 1, syslog-ng PE sends messages reliably: it sends a message to Elasticsearch, then waits for a reply from Elasticsearch. If sending fails, syslog-ng PE resends the message, up to the number of attempts set in the retries() parameter. If every attempt fails, syslog-ng PE drops the message.

    This method ensures reliable message transfer, but is slow.

  • If flush_limit is higher than 1: syslog-ng PE sends messages in a batch, and receives the response asynchronously. In case of a problem, syslog-ng PE cannot resend the messages.

    This method is relatively fast (depending on the size of flush_limit), but the transfer is not reliable. In transport mode, several messages can be lost before syslog-ng PE recognizes the error. Node mode is more reliable in this sense, because the message loss rate is significantly lower.

  • If concurrent-requests is higher than 1, syslog-ng PE can send multiple batches simultaneously, increasing performance (and also the number of messages that can be lost in case of an error). For details, see the section called “concurrent_requests()”.
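
The trade-off between the two behaviors can be sketched with two destinations: one tuned for reliable delivery and one tuned for throughput. The destination names, the cluster name, and the index and type values are placeholders, and the numeric values are illustrative rather than recommended settings.

# Reliable but slow: one message per request, resent on failure
destination d_elastic_reliable {
    elasticsearch(
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        cluster("my-elasticsearch-cluster")
        flush_limit(1)
        # Drop the message after three failed attempts
        retries(3)
    );
};

# Fast but not reliable: large batches, several batches in flight
destination d_elastic_batch {
    elasticsearch(
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        cluster("my-elasticsearch-cluster")
        flush_limit(5000)
        # More concurrent batches also means more messages at risk on error
        concurrent_requests(2)
    );
};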

frac-digits()
Type: number (digits of fractions of a second)
Default: Value of the global option (which defaults to 0)

Description: The syslog-ng application can store fractions of a second in the timestamps according to the ISO8601 format. The frac-digits() parameter specifies the number of digits stored. The digits storing the fractions are padded by zeros if the original timestamp of the message specifies only seconds. Fractions can always be stored for the time the message was received. Note that syslog-ng can add the fractions to non-ISO8601 timestamps as well.

index()
Type: string
Default: N/A

Description: Name of the Elasticsearch index to store the log messages. You can use macros and templates as well. For example, index("syslog-ng_${YEAR}.${MONTH}.${DAY}").

log-fifo-size()
Type: number (messages)
Default: Use global setting.

Description: The number of messages that the output queue can store.

on-error()
Accepted values: drop-message|drop-property|fallback-to-string|silently-drop-message|silently-drop-property|silently-fallback-to-string
Default: Use the global setting (which defaults to drop-message)

Description: Controls what happens when type-casting fails and syslog-ng PE cannot convert some data to the specified type. By default, syslog-ng PE drops the entire message and logs the error. Currently the value-pairs() option uses the settings of on-error().

  • drop-message: Drop the entire message and log an error message to the internal() source. This is the default behavior of syslog-ng PE.

  • drop-property: Omit the affected property (macro, template, or message-field) from the log message and log an error message to the internal() source.

  • fallback-to-string: Convert the property to string and log an error message to the internal() source.

  • silently-drop-message: Drop the entire message silently, without logging the error.

  • silently-drop-property: Omit the affected property (macro, template, or message-field) silently, without logging the error.

  • silently-fallback-to-string: Convert the property to string silently, without logging the error.
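
The following sketch shows where on-error() could matter for this destination. It assumes that the type-hinting notation of value-pairs (here, int64()) is available in the format-json template function of your syslog-ng PE version; the destination name, the pid field, the cluster name, and the index and type values are chosen only for illustration.

destination d_elastic_typed {
    elasticsearch(
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        cluster("my-elasticsearch-cluster")
        # Send the PID as a number; the cast fails if $PID is empty or not numeric
        template("$(format-json --scope rfc5424 --exclude DATE --key ISODATE @timestamp=${ISODATE} pid=int64($PID))")
        # If the cast fails, send the value as a string instead of dropping the message
        on-error("fallback-to-string")
    );
};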

port()
Type: number
Default: 9300

Description: The port number of the Elasticsearch server. This option is used only in transport mode: client-mode("transport").

retries()
Type: number (of attempts)
Default: 3

Description: The number of times syslog-ng PE attempts to send a message to this destination. If syslog-ng PE cannot send a message, it tries again until the number of attempts reaches the value of retries(), then drops the message.

resource()
Type: string
Default: N/A

Description: The list of Elasticsearch resources to load, separated by semicolons. For example, resource("/home/user/elasticsearch/elasticsearch.yml;/home/user/elasticsearch/elasticsearch2.yml").
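
As a sketch of node mode, the following destination loads its node settings from a single resource file. The file path, the destination name, and the index and type values are hypothetical. The comments outline what such a file might contain: path.home is the setting required in node mode, and cluster.name is the standard Elasticsearch setting for the cluster name, with a placeholder value.

# Assumed content of /etc/syslog-ng/elasticsearch.yml:
#   path.home: /usr/share/elasticsearch
#   cluster.name: my-elasticsearch-cluster

destination d_elastic_node {
    elasticsearch(
        # Directory that holds the Elasticsearch .jar files (example path)
        client_lib_dir("/opt/elasticsearch/lib/")
        client_mode("node")
        # The cluster name comes from the resource file, so cluster() is omitted
        resource("/etc/syslog-ng/elasticsearch.yml")
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
    );
};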

server()
Type: list of hostnames
Default: 127.0.0.1

Description: Specifies the hostname or IP address of the Elasticsearch server. You can specify IPv4 addresses (for example, 192.168.0.1) and IPv6 addresses (for example, [::1]) as well. To specify multiple addresses, separate them with spaces, for example, server("127.0.0.1 remote-server-hostname1 remote-server-hostname2").

This option is used only in transport mode: client-mode("transport")

template()
Type: template or template function
Default: $(format-json --scope rfc5424 --exclude DATE --key ISODATE @timestamp=${ISODATE})

Description: The message as sent to the Elasticsearch server. Typically, you will want to use the command-line notation of the format-json template function.

To add a @timestamp field to the message, for example, to use with Kibana, include the @timestamp=${ISODATE} expression in the template. For example: template($(format-json --scope rfc5424 --exclude DATE --key ISODATE @timestamp=${ISODATE}))

For details on formatting messages in JSON format, see the section called “format-json”.

throttle()
Type: number (messages per second)
Default: 0

Description: Sets the maximum number of messages sent to the destination per second. Use this output-rate-limiting functionality only when using disk-buffer as well to avoid the risk of losing messages. Specifying 0 or a lower value sets the output limit to unlimited.
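
Following the advice above, this sketch pairs throttle() with a reliable disk-buffer so that messages exceeding the rate limit are buffered on disk. The destination name, the cluster name, the index and type values, and the numeric limits are illustrative assumptions.

destination d_elastic_throttled {
    elasticsearch(
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        cluster("my-elasticsearch-cluster")
        # Send at most 1000 messages per second
        throttle(1000)
        # Buffer the excess on disk instead of risking message loss
        disk-buffer(
            disk-buf-size(2000000)
            reliable(yes)
        )
    );
};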

time-zone()
Type: name of the timezone, or the timezone offset
Default: unspecified

Description: Convert timestamps to the timezone specified by this option. If this option is not set, then the original timezone information in the message is used. Converting the timezone changes the values of all date-related macros derived from the timestamp, for example, HOUR. For the complete list of such macros, see the section called “Date-related macros”.

The timezone can be specified using the name of the timezone (for example, time-zone("Europe/Budapest")), or as the timezone offset in +/-HH:MM format (for example, +01:00). On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.

ts-format()
Type: rfc3164, bsd, rfc3339, iso
Default: Use the global option (which defaults to rfc3164)

Description: Override the global timestamp format (set in the global ts-format() parameter) for the specific destination. For details, see the section called “A note on timezones and timestamps”.
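
The timestamp-related options of the destination (frac-digits(), time-zone(), and ts-format()) can be set together, as in the following sketch. The destination name, the cluster name, and the index and type values are placeholders, and how each option affects the final JSON output depends on which date macros your template uses.

destination d_elastic_timestamps {
    elasticsearch(
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        cluster("my-elasticsearch-cluster")
        # Convert timestamps (and the date-related macros) to this timezone
        time-zone("Europe/Budapest")
        # Use ISO8601 timestamps with millisecond precision
        ts-format(iso)
        frac-digits(3)
    );
};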

type()
Type: string
Default: N/A

Description: The type of the index. For example, type("test").
