Chat now with support
Chat with Support

syslog-ng Premium Edition 6.0.17 - Administration Guide

Preface Chapter 1. Introduction to syslog-ng Chapter 2. The concepts of syslog-ng Chapter 3. Installing syslog-ng Chapter 4. The syslog-ng PE quick-start guide Chapter 5. The syslog-ng PE configuration file Chapter 6. Collecting log messages — sources and source drivers Chapter 7. Sending and storing log messages — destinations and destination drivers Chapter 8. Routing messages: log paths, reliability, and filters Chapter 9. Global options of syslog-ng PE Chapter 10. TLS-encrypted message transfer Chapter 11. FIPS-compliant syslog-ng Chapter 12.  Reliable Log Transfer Protocol™ Chapter 13. Reliability and minimizing the loss of log messages Chapter 14. Manipulating messages Chapter 15. Parsing and segmenting structured messages Chapter 16. Processing message content with a pattern database Chapter 17. Statistics and metrics of syslog-ng Chapter 18. Multithreading and scaling in syslog-ng PE Chapter 19. Troubleshooting syslog-ng Chapter 20. Best practices and examples

Sending messages directly to Elasticsearch version 2.0 or higher

Sending messages directly to Elasticsearch version 2.0-2.4.6

Starting with version 5.6 of syslog-ng PE can directly send log messages to Elasticsearch, allowing you to search and analyze your data in real time, and visualize it with Kibana.

NOTE:

In order to use this destination, syslog-ng Premium Edition must run in server mode. Typically, only the central syslog-ng Premium Edition server uses this destination. For details on the server mode, see the section called “Server mode”.

Note the following limitations when using the syslog-ng PEelasticsearch2 destination:

  • This destination is only supported on the Linux platforms that use the linux glibc2.11 installer, including: Red Hat ES 7, Ubuntu 14.04 (Trusty Tahr).

  • Since syslog-ng PE uses the official Java Elasticsearch libraries, the elasticsearch2 destination has significant memory usage.

  • The log messages of the underlying client libraries are available in the internal() source of syslog-ng PE.

Declaration: 

@module mod-java
@include "scl.conf"

elasticsearch2(
    index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
    type("test")
    cluster("syslog-ng")
);

Example 7.5. Sending log data to Elasticsearch version 2.0-2.4.6

The following example defines an elasticsearch2 destination that sends messages in transport mode to an Elasticsearch server running on the localhost, using only the required parameters.

@module mod-java
@include "scl.conf"

destination d_elastic {
  elasticsearch2(
    index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
    type("test")
  );
};

The following example sends 10000 messages in a batch, in node mode, and includes a custom unique ID for each message.

@module mod-java
@include "scl.conf"

options {
  threaded(yes);
  use_uniqid(yes);
};

source s_syslog {
  syslog();
};

destination d_elastic {
  elasticsearch2(
    index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
    type("test")
    cluster("syslog-ng")
    client_mode("node")
    custom_id("${UNIQID}")
    flush-limit("10000")
  );
};

log {
  source(s_syslog);
  destination(d_elastic);
  flags(flow-control);
};

Procedure 7.2. Prerequisites

To send messages from syslog-ng PE to Elasticsearch, complete the following steps.

Steps: 

  1. Download and install the Java Runtime Environment (JRE), 2.x (or newer). The syslog-ng PEelasticsearch2 destination is tested and supported when using the Oracle implementation of Java. Other implementations are untested and unsupported, they may or may not work as expected.

  2. Download the Elasticsearch libraries (version 2.0-2.4.6) from https://www.elastic.co/downloads/elasticsearch.One Identity tests the destination using Elasticsearch version 2.4.To use Elasticsearch 1.x, use the elasticsearch() destination (see the section called “Sending messages directly to Elasticsearch version 1.x”).

  3. Extract the Elasticsearch libraries into a temporary directory, then collect the various .jar files into a single directory (for example, /opt/elasticsearch/lib/) where syslog-ng PE can access them. You must specify this directory in the syslog-ng PE configuration file. The files are located in the lib directory and its subdirectories of the Elasticsearch release package.

How syslog-ng PE interacts with Elasticsearch

The syslog-ng PE application sends the log messages to the official Elasticsearch client library, which forwards the data to the Elasticsearch nodes. The way how syslog-ng PE interacts with Elasticsearch is described in the following steps.

  • After syslog-ng PE is started and the first message arrives to the elasticsearch2 destination, the elasticsearch2 destination tries to connect to the Elasticsearch server or cluster. If the connection fails, syslog-ng PE will repeatedly attempt to connect again after the period set in time-reopen() expires.

  • If the connection is established, syslog-ng PE sends JSON-formatted messages to Elasticsearch.

    • If flush_limit is set to 1: syslog-ng PE sends the message reliably: it sends a message to Elasticsearch, then waits for a reply from Elasticsearch. In case of failure, syslog-ng PE repeats sending the message, as set in the retries() parameter. If sending the message fails for retries() times, syslog-ng PE drops the message.

      This method ensures reliable message transfer, but is slow.

    • If flush_limit is higher than 1: syslog-ng PE sends messages in a batch, and receives the response asynchronously. In case of a problem, syslog-ng PE cannot resend the messages.

      This method is relatively fast (depending on the size of flush_limit), but the transfer is not reliable. In transport mode, several messages can be lost before syslog-ng PE recognizes the error. Node mode is more reliable in this sense, because the message loss rate is significantly lower.

    • If concurrent-requests is higher than 1, syslog-ng PE can send multiple batches simultaneously, increasing performance (and also the number of messages that can be lost in case of an error). For details, see the section called “concurrent_requests()”.

Client modes

The syslog-ng PE application can interact with Elasticsearch in transport mode or node mode.

  • Transport mode. The syslog-ng PE application uses the transport client API of Elasticsearch, and uses the server(), port(), and cluster() options from the syslog-ng PE configuration file.

  • Node mode. The syslog-ng PE application acts as an Elasticsearch node (client no-data), using the node client API of Elasticsearch. Further options for the node can be describe in an Elasticsearch configuration file specified in the resource() option.

    NOTE:

    In Node mode, it is required to define the home of the elasticsearch installation with the path.home paramter in the .yml file. For example: path.home: /usr/share/elasticsearch.

  • Shield mode. This mode is available only from syslog-ng PE version 5.6. In this mode, syslog-ng PE uses the transport client API of Elasticsearch, and uses the server(), port(), and cluster() options from the syslog-ng PE configuration file, but with Shield (X-Pack security) support. For more details about Shield, see: https://www.elastic.co/products/x-pack/security.

Elasticsearch destination options

The elasticsearch2 destination can directly send log messages to Elasticsearch, allowing you to search and analyze your data in real time, and visualize it with Kibana. The elasticsearch2 destination has the following options.

Required options: 

The following options are required: index(), type(). In node mode, either the cluster() or the resource() option is required as well. Note that to use elasticsearch2, you must add the following lines to the beginning of your syslog-ng PE configuration:

@module mod-java
@include "scl.conf"
client_lib_dir()
Type: string
Default: N/A

Description: Include the path to the directory where you copied the required libraries (see Procedure 7.2, “Prerequisites”), for example, client_lib_dir(/user/share/elasticsearch-2.2.0/lib).

client_mode()
Type: transport | node | shield
Default: node

Description: Specifies the client mode used to connect to the Elasticsearch server, for example, client_mode("node").

  • Transport mode. The syslog-ng PE application uses the transport client API of Elasticsearch, and uses the server(), port(), and cluster() options from the syslog-ng PE configuration file.

  • Node mode. The syslog-ng PE application acts as an Elasticsearch node (client no-data), using the node client API of Elasticsearch. Further options for the node can be describe in an Elasticsearch configuration file specified in the resource() option.

    NOTE:

    In Node mode, it is required to define the home of the elasticsearch installation with the path.home paramter in the .yml file. For example: path.home: /usr/share/elasticsearch.

  • Shield mode. This mode is available only from syslog-ng PE version 5.6. In this mode, syslog-ng PE uses the transport client API of Elasticsearch, and uses the server(), port(), and cluster() options from the syslog-ng PE configuration file, but with Shield (X-Pack security) support. For more details about Shield, see: https://www.elastic.co/products/x-pack/security.

  • To use this mode, add the Shield .jar file (shield-x.x.x.jar) to the same directory where your Elasticsearch .jar files are located. You can download the Shield distribution and extract the .jar file manually, or you can get it from the Elasticsearch Maven repository.

    It inherits the Transport mode options, but the Shield-related options must be configured in the .yml file (see the resource() option of syslog-ng PE). For more details about the possible options, see: https://www.elastic.co/guide/en/shield/current/reference.html#ref-ssl-tls-settings.

    Example 7.6. Example for the .yml file

    shield.user: es_admin:********
    shield.transport.ssl: true
    shield.ssl.keystore.path: /usr/share/elasticsearch/node.jks
    shield.ssl.keystore.password: mypassword
    

cluster()
Type: string
Default: N/A

Description: Specifies the name or the Elasticsearch cluster, for example, cluster("my-elasticsearch-cluster"). Optionally, you can specify the name of the cluster in the Elasticsearch resource file. For details, see the section called “resource()”.

concurrent_requests()
Type: number
Default: 0

Description: The number of concurrent (simultaneous) requests that syslog-ng PE sends to the Elasticsearch server. Set this option to 1 or higher to increase performance. When using the concurrent_requests() option, make sure that the flush-limit() option is higher than one, otherwise it will not have any noticeable effect. For details, see the section called “flush_limit()”.

Caution:

Hazard of data loss! Using the concurrent-requests() option increases the number of messages lost in case the Elasticsearch server becomes unaccessible.

custom_id()
Type: template or template function
Default: N/A

Description: Use this option to specify a custom ID for the records inserted into Elasticsearch. If this option is not set, the Elasticsearch server automatically generates and ID for the message. For example: custom_id(${UNIQID}) (Note that to use the ${UNIQID} macro, the use-uniqid() global option must be enabled. For details, see the section called “use-uniqid()”.)

disk-buffer()

Description: This option enables putting outgoing messages into the disk buffer of the destination to avoid message loss in case of a system failure on the destination side. It has the following options:

reliable()
Type: yes|no
Default: no

Description: If set to yes, syslog-ng PE cannot lose logs in case of reload/restart, unreachable destination or syslog-ng PE crash. This solution provides a slower, but reliable disk-buffer option. It is created and initialized at startup and gradually grows as new messages arrive. If set to no, the normal disk-buffer will be used. This provides a faster, but less reliable disk-buffer option.

Caution:

Hazard of data loss! If you change the value of reliable() option when there are messages in the disk-buffer, the messages stored in the disk-buffer will be lost.

dir()
Type: string
Default: N/A

Description: Defines the folder where the disk-buffer files are stored. This option has priority over --qdisk-dir=.

Caution:

When creating a new dir() option for a disk buffer, or modifying an existing one, make sure you delete the persist file, or at least remove the relevant persist-entry.

syslog-ng PE creates disk-buffer files based on the path recorded in the persist file. Therefore, if the persist file or the relevant entry is not deleted after modifying the dir() option, then following a restart, syslog-ng PE will look for or create disk-buffer files in their old location. To ensure that syslog-ng PE uses the new dir() setting, the persist file must not contain any information about the destinations which the disk-buffer file in question belongs to.

disk-buf-size()
Type: number (bytes)
Default:
Description: This is a required option. The maximum size of the disk-buffer in bytes. The minimum value is 1048576 bytes. If you set a smaller value, the minimum value will be used automatically. It replaces the old log-disk-fifo-size() option.
mem-buf-length()
Type: number (messages)
Default: 10000
Description: Use this option if the option reliable() is set to no. This option contains the number of messages stored in overflow queue. It replaces the old log-fifo-size() option. It inherits the value of the global log-fifo-size() option if provided. If it is not provided, the default value is 10000 messages. Note that this option will be ignored if the option reliable() is set to yes.
mem-buf-size()
Type: number (bytes)
Default: 163840000
Description: Use this option if the option reliable() is set to yes. This option contains the size of the messages in bytes that is used in the memory part of the disk buffer. It replaces the old log-fifo-size() option. It does not inherit the value of the global log-fifo-size() option, even if it is provided. Note that this option will be ignored if the option reliable() is set to no.
quot-size()
Type: number (messages)
Default: 64
Description: The number of messages stored in the output buffer of the destination.

Options reliable() and disk-buf-size() are required options.

Example 7.7. Examples for using disk-buffer()

In the following case reliable disk-buffer() is used.

destination d_demo {
    network(
            "127.0.0.1"
            port(3333)
            disk-buffer(
                mem-buf-size(10000)
                disk-buf-size(2000000)
                reliable(yes)
                dir("/tmp/disk-buffer")
            )
        );
};

In the following case normal disk-buffer() is used.

destination d_demo {
    network(
            "127.0.0.1"
            port(3333)
            disk-buffer(
                mem-buf-length(10000)
                disk-buf-size(2000000)
                reliable(no)
                dir("/tmp/disk-buffer")
            )
        );
};

flush_limit()
Type: number
Default: 5000

Description: The number of messages that syslog-ng PE sends to the Elasticsearch server in a single batch.

  • If flush_limit is set to 1: syslog-ng PE sends the message reliably: it sends a message to Elasticsearch, then waits for a reply from Elasticsearch. In case of failure, syslog-ng PE repeats sending the message, as set in the retries() parameter. If sending the message fails for retries() times, syslog-ng PE drops the message.

    This method ensures reliable message transfer, but is slow.

  • If flush_limit is higher than 1: syslog-ng PE sends messages in a batch, and receives the response asynchronously. In case of a problem, syslog-ng PE cannot resend the messages.

    This method is relatively fast (depending on the size of flush_limit), but the transfer is not reliable. In transport mode, several messages can be lost before syslog-ng PE recognizes the error. Node mode is more reliable in this sense, because the message loss rate is significantly lower.

  • If concurrent-requests is higher than 1, syslog-ng PE can send multiple batches simultaneously, increasing performance (and also the number of messages that can be lost in case of an error). For details, see the section called “concurrent_requests()”.

frac-digits()
Type: number (digits of fractions of a second)
Default: Value of the global option (which defaults to 0)

Description: The syslog-ng application can store fractions of a second in the timestamps according to the ISO8601 format. The frac-digits() parameter specifies the number of digits stored. The digits storing the fractions are padded by zeros if the original timestamp of the message specifies only seconds. Fractions can always be stored for the time the message was received. Note that syslog-ng can add the fractions to non-ISO8601 timestamps as well.

index()
Type: string
Default: N/A

Description: Name of the Elasticsearch index to store the log messages. You can use macros and templates as well. For example, index("syslog-ng_${YEAR}.${MONTH}.${DAY}").

log-fifo-size()
Type: number (messages)
Default: Use global setting.

Description: The number of messages that the output queue can store.

on-error()
Accepted values: drop-message|drop-property|fallback-to-string|silently-drop-message|silently-drop-property|silently-fallback-to-string
Default: Use the global setting (which defaults to drop-message)

Description: Controls what happens when type-casting fails and syslog-ng PE cannot convert some data to the specified type. By default, syslog-ng PE drops the entire message and logs the error. Currently the value-pairs() option uses the settings of on-error().

  • drop-message: Drop the entire message and log an error message to the internal() source. This is the default behavior of syslog-ng PE.

  • drop-property: Omit the affected property (macro, template, or message-field) from the log message and log an error message to the internal() source.

  • fallback-to-string: Convert the property to string and log an error message to the internal() source.

  • silently-drop-message: Drop the entire message silently, without logging the error.

  • silently-drop-property: Omit the affected property (macro, template, or message-field) silently, without logging the error.

  • silently-fallback-to-string: Convert the property to string silently, without logging the error.

port()
Type: number
Default: 9300

Description: The port number of the Elasticsearch server. This option is used only in transport mode: client_mode("transport")

retries()
Type: number (of attempts)
Default: 3

Description: The number of times syslog-ng PE attempts to send a message to this destination. If syslog-ng PE could not send a message, it will try again until the number of attempts reaches retries, then drops the message.

resource()
Type: string
Default: N/A

Description: The list of Elasticsearch resources to load, separated by semicolons. For example, resource("/home/user/elasticsearch/elasticsearch.yml;/home/user/elasticsearch/elasticsearch2.yml").

server()
Type: list of hostnames
Default: 127.0.0.1

Description: Specifies the hostname or IP address of the Elasticsearch server. When specifying an IP address, IPv4 (for example, 192.168.0.1) or IPv6 (for example, [::1]) can be used as well. When specifying multiple addresses, use space to separate the addresses, for example, server("127.0.0.1 remote-server-hostname1 remote-server-hostname2")

This option is used only in transport mode: client_mode("transport")

template()
Type: template or template function
Default: $(format-json --scope rfc5424 --exclude DATE --key ISODATE @timestamp=${ISODATE})

Description: The message as sent to the Elasticsearch server. Typically, you will want to use the command-line notation of the format-json template function.

To add a @timestamp field to the message, for example, to use with Kibana, include the @timestamp=${ISODATE} expression in the template. For example: template($(format-json --scope rfc5424 --exclude DATE --key ISODATE @timestamp=${ISODATE}))

For details on formatting messages in JSON format, see the section called “format-json”.

throttle()
Type: number (messages per second)
Default: 0

Description: Sets the maximum number of messages sent to the destination per second. Use this output-rate-limiting functionality only when using disk-buffer as well to avoid the risk of losing messages. Specifying 0 or a lower value sets the output limit to unlimited.

time-zone()
Type: name of the timezone, or the timezone offset
Default: unspecified

Description: Convert timestamps to the timezone specified by this option. If this option is not set, then the original timezone information in the message is used. Converting the timezone changes the values of all date-related macros derived from the timestamp, for example, HOUR. For the complete list of such macros, see the section called “Date-related macros”.

The timezone can be specified as using the name of the (for example time-zone("Europe/Budapest")), or as the timezone offset in +/-HH:MM format (for example +01:00). On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.

ts-format()
Type: rfc3164, bsd, rfc3339, iso
Default: Use the global option (which defaults to rfc3164)

Description: Override the global timestamp format (set in the global ts-format() parameter) for the specific destination. For details, see the section called “A note on timezones and timestamps”.

type()
Type: string
Default: N/A

Description: The type of the index. For example, type("test").

Storing messages in plain-text files

The file driver is one of the most important destination drivers in syslog-ng. It allows to output messages to the specified text file, or to a set of files.

The destination filename may include macros which get expanded when the message is written, thus a simple file() driver may create several files: for example, syslog-ng PE can store the messages of client hosts in a separate file for each host. For more information on available macros see the section called “Macros of syslog-ng PE.

If the expanded filename refers to a directory which does not exist, it will be created depending on the create-dirs() setting (both global and a per destination option).

The file() has a single required parameter that specifies the filename that stores the log messages. For the list of available optional parameters, see the section called “file() destination options”.

Declaration: 

file(filename options());

Example 7.8. Using the file() driver

destination d_file { file("/var/log/messages"); };

NOTE:

On Microsoft Windows platforms, enclose paths and filenames between single quotes, for example, 'C:\temp\logs\mylogs.log'


Example 7.9. Using the file() driver with macros in the file name and a template for the message

destination d_file {
        file("/var/log/${YEAR}.${MONTH}.${DAY}/messages"
             template("${HOUR}:${MIN}:${SEC} ${TZ} ${HOST} [${LEVEL}] ${MSG}\n")
             template-escape(no));
};

NOTE:

When using this destination, update the configuration of your log rotation program to rotate these files. Otherwise, the log files can become very large.

Also, after rotating the log files, reload syslog-ng PE using the syslog-ng-ctl reload command, or use another method to send a SIGHUP to syslog-ng PE.

NOTE:

The syslog-ng PE application uses UNIX-style line-breaks. This means that on Windows, the end of the line is marked with LF instead of CRLF.

Caution:

Since the state of each created file must be tracked by syslog-ng, it consumes some memory for each file. If no new messages are written to a file within 60 seconds (controlled by the time-reap() global option), it is closed, and its state is freed.

Exploiting this, a DoS attack can be mounted against the system. If the number of possible destination files and its needed memory is more than the amount available on the syslog-ng server.

The most suspicious macro is ${PROGRAM}, where the number of possible variations is rather high. Do not use the ${PROGRAM} macro in insecure environments.

file() destination options

The file() driver outputs messages to the specified text file, or to a set of files. The file() destination has the following options:

Caution:

When creating several thousands separate log files, syslog-ng might not be able to open the required number of files. This might happen for example when using the ${HOST} macro in the filename while receiving messages from a large number of hosts. To overcome this problem, adjust the --fd-limit command-line parameter of syslog-ng or the global ulimit parameter of your host. For setting the --fd-limit command-line parameter of syslog-ng see the syslog-ng(8) manual page. For setting the ulimit parameter of the host, see the documentation of your operating system.

create-dirs()
Type: yes or no
Default: no

Description: Enable creating non-existing directories.

dir-group()
Type: string
Default: Use the global settings

Description: The group of the directories created by syslog-ng. To preserve the original properties of an existing directory, use the option without specifying an attribute: dir-group().

dir-owner()
Type: string
Default: Use the global settings

Description: The owner of the directories created by syslog-ng. To preserve the original properties of an existing directory, use the option without specifying an attribute: dir-owner().

dir-perm()
Type: number (octal notation)
Default: Use the global settings

Description: The permission mask of directories created by syslog-ng. Log directories are only created if a file after macro expansion refers to a non-existing directory, and directory creation is enabled (see also the create-dirs() option). For octal numbers prefix the number with 0, for example use 0755 for rwxr-xr-x.

To preserve the original properties of an existing directory, use the option without specifying an attribute: dir-perm(). Note that when creating a new directory without specifying attributes for dir-perm(), the default permission of the directories is masked with the umask of the parent process (typically 0022).

flags()
Type: no-multi-line, syslog-protocol, threaded
Default: empty set

Description: Flags influence the behavior of the destination driver.

  • no-multi-line: The no-multi-line flag disables line-breaking in the messages: the entire message is converted to a single line.

  • syslog-protocol: The syslog-protocol flag instructs the driver to format the messages according to the new IETF syslog protocol standard (RFC5424), but without the frame header. If this flag is enabled, macros used for the message have effect only for the text of the message, the message header is formatted to the new standard. Note that this flag is not needed for the syslog driver, and that the syslog driver automatically adds the frame header to the messages.

  • threaded: The threaded flag enables multithreading for the destination. For details on multithreading, see Chapter 18, Multithreading and scaling in syslog-ng PE.

    NOTE:

    The file destination uses multiple threads only if the destination filename contains macros.

flush-lines()
Type: number (messages)
Default: Use global setting.

Description: Specifies how many lines are sent to a destination at a time. The syslog-ng PE application waits for this number of lines to accumulate and sends them off in a single batch. Setting this number high increases throughput as fully filled frames are sent to the destination, but also increases message latency.

For optimal performance when sending messages to an syslog-ng PE server, make sure that the flush-lines() is smaller than the window size set using the log-iw-size() option in the source of your server.

flush-timeout() (OBSOLETE)
Type: time in milliseconds
Default: Use global setting.

Description: This is an obsolete option. Specifies the time syslog-ng waits for lines to accumulate in its output buffer. For details, see the flush-lines() option.

NOTE:

This option will be removed from the list of acceptable options. After that, your configuration will become invalid if it still contains the flush-timeout() option. To avoid future problems, remove this option from your configuration.

frac-digits()
Type: number (digits of fractions of a second)
Default: Value of the global option (which defaults to 0)

Description: The syslog-ng application can store fractions of a second in the timestamps according to the ISO8601 format. The frac-digits() parameter specifies the number of digits stored. The digits storing the fractions are padded by zeros if the original timestamp of the message specifies only seconds. Fractions can always be stored for the time the message was received. Note that syslog-ng can add the fractions to non-ISO8601 timestamps as well.

fsync()
Type: yes or no
Default: no

Description: Forces an fsync() call on the destination fd after each write. Note: enabling this option may seriously degrade performance.

group()
Type: string
Default: Use the global settings

Description: Set the group of the created file to the one specified. To preserve the original properties of an existing file, use the option without specifying an attribute: group().

local-time-zone()
Type: name of the timezone, or the timezone offset
Default: The local timezone.

Description: Sets the timezone used when expanding filename and tablename templates.

The timezone can be specified as using the name of the (for example time-zone("Europe/Budapest")), or as the timezone offset in +/-HH:MM format (for example +01:00). On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.

log-fifo-size()
Type: number (messages)
Default: Use global setting.

Description: The number of messages that the output queue can store.

mark-freq()
Accepted values: number (seconds)
Default: 1200

Description: An alias for the obsolete mark() option, retained for compatibility with syslog-ng version 1.6.x. The number of seconds between two MARK messages. MARK messages are generated when there was no message traffic to inform the receiver that the connection is still alive. If set to zero (0), no MARK messages are sent. The mark-freq() can be set for global option and/or every MARK capable destination driver if mark-mode() is periodical or dst-idle or host-idle. If mark-freq() is not defined in the destination, then the mark-freq() will be inherited from the global options. If the destination uses internal mark-mode(), then the global mark-freq() will be valid (does not matter what mark-freq() set in the destination side).

mark-mode()
Accepted values: internal | dst-idle | host-idle | periodical | none | global
Default:

internal for pipe, program drivers

none for file, unix-dgram, unix-stream drivers

global for syslog, tcp, udp destinations

host-idle for global option

Description: The mark-mode() option can be set for the following destination drivers: file(), program(), unix-dgram(), unix-stream(), network(), pipe(), syslog() and in global option.

  • internal: When internal mark mode is selected, internal source should be placed in the log path as this mode does not generate mark by itself at the destination. This mode only yields the mark messages from internal source. This is the mode as syslog-ng PE 3.x worked. MARK will be generated by internal source if there was NO traffic on local sources:

    file(), pipe(), unix-stream(), unix-dgram(), program()

  • dst-idle: Sends MARK signal if there was NO traffic on destination drivers. MARK signal from internal source will be dropped.

    MARK signal can be sent by the following destination drivers: network(), syslog(), program(), file(), pipe(), unix-stream(), unix-dgram().

  • host-idle: Sends MARK signal if there was NO local message on destination drivers. For example MARK is generated even if messages were received from tcp. MARK signal from internal source will be dropped.

    MARK signal can be sent by the following destination drivers: network(), syslog(), program(), file(), pipe(), unix-stream(), unix-dgram().

  • periodical: Sends MARK signal perodically, regardless of traffic on destination driver. MARK signal from internal source will be dropped.

    MARK signal can be sent by the following destination drivers: network(), syslog(), program(), file(), pipe(), unix-stream(), unix-dgram().

  • none: Destination driver drops all MARK messages. If an explicit mark-mode() is not given to the drivers where none is the default value, then none will be used.

  • global: Destination driver uses the global mark-mode() setting. The syslog-ng interprets syntax error if the global mark-mode() is global.

NOTE:

In case of dst-idle, host-idle and periodical, the MARK message will not be written in the destination, if it is not open yet.

Available in syslog-ng PE 4 LTS and later.

overwrite-if-older()
Type: number (seconds)
Default: 0

Description: If set to a value higher than 0, syslog-ng PE checks when the file was last modified before starting to write into the file. If the file is older than the specified amount of time (in seconds), then syslog-ng removes the existing file and opens a new file with the same name. In combination with for example the ${WEEKDAY} macro, this can be used for simple log rotation, in case not all history has to be kept. (Note that in this weekly log rotation example if its Monday 00:01, then the file from last Monday is not seven days old, because it was probably last modified shortly before 23:59 last Monday, so it is actually not even six days old. So in this case, set the overwrite-if-older() parameter to a-bit-less-than-six-days, for example, to 518000 seconds.

owner()
Type: string
Default: Use the global settings

Description: Set the owner of the created file to the one specified. To preserve the original properties of an existing file, use the option without specifying an attribute: owner().

pad-size()
Type: number (bytes)
Default: 0

Description: If set, syslog-ng PE will pad output messages to the specified size (in bytes). Some operating systems (such as HP-UX) pad all messages to block boundary. This option can be used to specify the block size. (HP-UX uses 2048 bytes).

Caution:

Hazard of data loss! If the size of the incoming message is larger than the previously set pad-size() value, syslog-ng will truncate the message to the specified size. Therefore, all message content above that size will be lost.

perm()
Type: number (octal notation)
Default: Use the global settings

Description: The permission mask of the file if it is created by syslog-ng. For octal numbers prefix the number with 0, for example use 0755 for rwxr-xr-x.

To preserve the original properties of an existing file, use the option without specifying an attribute: perm().

suppress()
Type: seconds
Default: 0 (disabled)

Description: If several identical log messages would be sent to the destination without any other messages between the identical messages (for example, an application repeated an error message ten times), syslog-ng can suppress the repeated messages and send the message only once, followed by the Last message repeated n times. message. The parameter of this option specifies the number of seconds syslog-ng waits for identical messages.

template()
Type: string
Default: A format conforming to the default logfile format.

Description: Specifies a template defining the logformat to be used in the destination. Macros are described in the section called “Macros of syslog-ng PE. Please note that for network destinations it might not be appropriate to change the template as it changes the on-wire format of the syslog protocol which might not be tolerated by stock syslog receivers (like syslogd or syslog-ng itself). For network destinations make sure the receiver can cope with the custom format defined.

template-escape()
Type: yes or no
Default: no

Description: Turns on escaping for the ', ", and backspace characters in templated output files. This is useful for generating SQL statements and quoting string contents so that parts of the log message are not interpreted as commands to the SQL server.

time-zone()
Type: name of the timezone, or the timezone offset
Default: unspecified

Description: Convert timestamps to the timezone specified by this option. If this option is not set, then the original timezone information in the message is used. Converting the timezone changes the values of all date-related macros derived from the timestamp, for example, HOUR. For the complete list of such macros, see the section called “Date-related macros”.

The timezone can be specified as using the name of the (for example time-zone("Europe/Budapest")), or as the timezone offset in +/-HH:MM format (for example +01:00). On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.

ts-format()
Type: rfc3164, bsd, rfc3339, iso
Default: Use the global option (which defaults to rfc3164)

Description: Override the global timestamp format (set in the global ts-format() parameter) for the specific destination. For details, see the section called “A note on timezones and timestamps”.

Storing messages on the Hadoop Distributed File System (HDFS)

Starting with version 5.3, syslog-ng PE can send plain-text log files to the Hadoop Distributed File System (HDFS), allowing you to store your log data on a distributed, scalable file system. This is especially useful if you have huge amount of log messages that would be difficult to store otherwise, or if you want to process your messages using Hadoop tools (for example, Apache Pig).

NOTE:

In order to use this destination, syslog-ng Premium Edition must run in server mode. Typically, only the central syslog-ng Premium Edition server uses this destination. For details on the server mode, see the section called “Server mode”.

Note the following limitations when using the syslog-ng PEhdfs destination:

  • This destination is only supported on the Linux platforms that use the linux glibc2.11 installer, including: Red Hat ES 7, Ubuntu 14.04 (Trusty Tahr).

  • Since syslog-ng PE uses the official Java HDFS client, the hdfs destination has significant memory usage (about 400MB).

  • The syslog-ng PE application always creates a new file if the previous has been closed. Appending data to existing files is not supported.

  • Macros are not supported in the file path and the filename. You can use only simple file paths for your log files, for example, /usr/hadoop/logfile.txt.

  • You cannot set when log messages are flushed. Hadoop performs this action automatically, depending on its configured block size, and the amount of data received. There is no way for the syslog-ng PE application to influence when the messages are actually written to disk. This means that syslog-ng PE cannot guarantee that a message sent to HDFS is actually written to disk. When using flow-control or RLTP, syslog-ng PE acknowledges a message as written to disk when it passes the message to the HDFS client. This method is as reliable as your HDFS environment.

  • The log messages of the underlying client libraries are available in the internal() source of syslog-ng PE.

NOTE:

The hdfs destination has been tested with Hortonworks Data Platform.

Declaration: 

@module mod-java
@include "scl.conf"

hdfs(
    client_lib_dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:<path-to-preinstalled-hadoop-libraries>")
    hdfs_uri("hdfs://NameNode:8020")
    hdfs_file("<path-to-logfile>")
);

Example 7.10. Storing logfiles on HDFS

The following example defines an hdfs destination using only the required parameters.

@module mod-java
@include "scl.conf"

destination d_hdfs {
    hdfs(
        client_lib_dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/hadoop/libs")
        hdfs_uri("hdfs://10.140.32.80:8020")
        hdfs_file("/user/log/logfile.txt")
    );
};

Procedure 7.3. Prerequisites

To send messages from syslog-ng PE to HDFS, complete the following steps.

Steps: 

  1. If you want to use the Java-based modules of syslog-ng PE (for example, the Elasticsearch, HDFS, or Kafka destinations), download and install the Java Runtime Environment (JRE), 1.7 (or newer).

    The Java-based modules of syslog-ng PE are tested and supported when using the Oracle implementation of Java. Other implementations are untested and unsupported, they may or may not work as expected.

  2. Download the Hadoop Distributed File System (HDFS) libraries (version 2.x) from http://hadoop.apache.org/releases.html.

  3. Extract the HDFS libraries into a temporary directory, then collect the various .jar files into a single directory (for example, /opt/hadoop/lib/) where syslog-ng PE can access them. You must specify this directory in the syslog-ng PE configuration file. The files are located in the various lib directories under the share/ directory of the Hadoop release package. (For example, in Hadoop 2.7, required files are common/hadoop-common-2.7.0.jar, common/libs/*.jar, hdfs/hadoop-hdfs-2.7.0.jar, hdfs/lib/*, but this may change between Hadoop releases, so it is easier to copy every .jar file into a single directory.

Procedure 7.4. How syslog-ng PE interacts with HDFS

The syslog-ng PE application sends the log messages to the official HDFS client library, which forwards the data to the HDFS nodes. The way how syslog-ng PE interacts with HDFS is described in the following steps.

  1. After syslog-ng PE is started and the first message arrives to the hdfs destination, the hdfs destination tries to connect to the HDFS NameNode. If the connection fails, syslog-ng PE will repeatedly attempt to connect again after the period set in time-reopen() expires.

  2. syslog-ng PE checks if the path to the logfile exists. If a directory does not exist syslog-ng PE automatically creates it. syslog-ng PE creates the destination file (using the filename set in the syslog-ng PE configuration file, with a UUID suffix to make it unique, for example, /usr/hadoop/logfile.txt.3dc1c59e-ab3b-4b71-9e81-93db477ed9d9) and writes the message into the file. After the file is created, syslog-ng PE will write all incoming messages into the hdfs destination.

    NOTE:

    You cannot set when log messages are flushed. Hadoop performs this action automatically, depending on its configured block size, and the amount of data received. There is no way for the syslog-ng PE application to influence when the messages are actually written to disk. This means that syslog-ng PE cannot guarantee that a message sent to HDFS is actually written to disk. When using flow-control or RLTP, syslog-ng PE acknowledges a message as written to disk when it passes the message to the HDFS client. This method is as reliable as your HDFS environment.

  3. If the HDFS client returns an error, syslog-ng PE attempts to close the file, then opens a new file and repeats sending the message (trying to connect to HDFS and send the message), as set in the retries() parameter. If sending the message fails for retries() times, syslog-ng PE drops the message.

  4. The syslog-ng PE application closes the destination file in the following cases:

    • syslog-ng PE is reloaded

    • syslog-ng PE is restarted

    • The HDFS client returns an error.

  5. If the file is closed and you have set an archive directory, syslog-ng PE moves the file to this directory. If syslog-ng PE cannot move the file for some reason (for example, syslog-ng PE cannot connect to the HDFS NameNode), the file remains at its original location, syslog-ng PE will not try to move it again.

Procedure 7.5. Storing messages with MapR-FS

The syslog-ng PE application is also compatible with MapR File System (MapR-FS), starting from version 5.4, syslog-ng Premium Edition is MapR certified. MapR-FS provides better performance, reliability, efficiency, maintainability, and ease of use compared to the default Hadoop Distributed Files System (HDFS). To use MapR-FS with syslog-ng PE, complete the following steps:

  1. Install MapR libraries. Instead of the official Apache HDFS libraries, MapR uses different libraries. The supported version is MapR 4.x.

    1. Download the libraries from the Maven Repository and Artifacts for MapR or get it from an already existing MapR installation.

    2. Install MapR. If you do not know how to install MapR, follow the instructions on the MapR website.

  2. In a default MapR installation, the required libraries are installed in the following path: /opt/mapr/lib.

    Enter the path where MapR was installed in the class-path option of the hdfs destination, for example:

    class_path("/opt/mapr/lib/")

    If the libraries were downloaded from the Maven Repository, the following additional libraries will be requiered. Note that the version numbers in the filenames can be different in the various Hadoop releases:commons-collections-3.2.1.jar, commons-logging-1.1.3.jar, hadoop-auth-2.5.1.jar, log4j-1.2.15.jar, slf4j-api-1.7.5.jar, commons-configuration-1.6.jar, guava-13.0.1.jar, hadoop-common-2.5.1.jar, maprfs-4.0.2-mapr.jar, slf4j-log4j12-1.7.5.jar, commons-lang-2.5.jar, hadoop-0.20.2-dev-core.jar, json-20080701.jar, protobuf-java-2.5.0.jar, zookeeper-3.4.5-mapr-1406.jar.

  3. Configure the hdfs destination in syslog-ng PE.

    Example 7.11. Storing logfiles with MapR-FS

    The following example defines an hdfs destination for MapR-FS using only the required parameters.

    @module mod-java
    @include "scl.conf"
    
    destination d_mapr {
        hdfs(
            client_lib_dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/mapr/lib/")
            hdfs_uri(maprfs://10.140.32.80")
            hdfs_file("/user/log/logfile.txt")
        );
    };
    

HDSF destination options

The hdfs destination stores the log messages in files on the Hadoop Distributed File System (HDFS). The hdfs destination has the following options.

The following options are required: hdfs_file(), hdfs_uri(). Note that to use hdfs, you must add the following lines to the beginning of your syslog-ng PE configuration:

@module mod-java
@include "scl.conf"
client_lib_dir()
Type: string
Default: N/A

Description: Include the path to the directory where you copied the required libraries (see Procedure 7.3, “Prerequisites”), for example, client_lib_dir(/user/share/hadoop/lib).

disk-buffer()

Description: This option enables putting outgoing messages into the disk buffer of the destination to avoid message loss in case of a system failure on the destination side. It has the following options:

reliable()
Type: yes|no
Default: no

Description: If set to yes, syslog-ng PE cannot lose logs in case of reload/restart, unreachable destination or syslog-ng PE crash. This solution provides a slower, but reliable disk-buffer option. It is created and initialized at startup and gradually grows as new messages arrive. If set to no, the normal disk-buffer will be used. This provides a faster, but less reliable disk-buffer option.

Caution:

Hazard of data loss! If you change the value of reliable() option when there are messages in the disk-buffer, the messages stored in the disk-buffer will be lost.

dir()
Type: string
Default: N/A

Description: Defines the folder where the disk-buffer files are stored. This option has priority over --qdisk-dir=.

Caution:

When creating a new dir() option for a disk buffer, or modifying an existing one, make sure you delete the persist file, or at least remove the relevant persist-entry.

syslog-ng PE creates disk-buffer files based on the path recorded in the persist file. Therefore, if the persist file or the relevant entry is not deleted after modifying the dir() option, then following a restart, syslog-ng PE will look for or create disk-buffer files in their old location. To ensure that syslog-ng PE uses the new dir() setting, the persist file must not contain any information about the destinations which the disk-buffer file in question belongs to.

disk-buf-size()
Type: number (bytes)
Default:
Description: This is a required option. The maximum size of the disk-buffer in bytes. The minimum value is 1048576 bytes. If you set a smaller value, the minimum value will be used automatically. It replaces the old log-disk-fifo-size() option.
mem-buf-length()
Type: number (messages)
Default: 10000
Description: Use this option if the option reliable() is set to no. This option contains the number of messages stored in overflow queue. It replaces the old log-fifo-size() option. It inherits the value of the global log-fifo-size() option if provided. If it is not provided, the default value is 10000 messages. Note that this option will be ignored if the option reliable() is set to yes.
mem-buf-size()
Type: number (bytes)
Default: 163840000
Description: Use this option if the option reliable() is set to yes. This option contains the size of the messages in bytes that is used in the memory part of the disk buffer. It replaces the old log-fifo-size() option. It does not inherit the value of the global log-fifo-size() option, even if it is provided. Note that this option will be ignored if the option reliable() is set to no.
quot-size()
Type: number (messages)
Default: 64
Description: The number of messages stored in the output buffer of the destination.

Options reliable() and disk-buf-size() are required options.

Example 7.12. Examples for using disk-buffer()

In the following case reliable disk-buffer() is used.

destination d_demo {
    network(
            "127.0.0.1"
            port(3333)
            disk-buffer(
                mem-buf-size(10000)
                disk-buf-size(2000000)
                reliable(yes)
                dir("/tmp/disk-buffer")
            )
        );
};

In the following case normal disk-buffer() is used.

destination d_demo {
    network(
            "127.0.0.1"
            port(3333)
            disk-buffer(
                mem-buf-length(10000)
                disk-buf-size(2000000)
                reliable(no)
                dir("/tmp/disk-buffer")
            )
        );
};

frac-digits()
Type: number (digits of fractions of a second)
Default: Value of the global option (which defaults to 0)

Description: The syslog-ng application can store fractions of a second in the timestamps according to the ISO8601 format. The frac-digits() parameter specifies the number of digits stored. The digits storing the fractions are padded by zeros if the original timestamp of the message specifies only seconds. Fractions can always be stored for the time the message was received. Note that syslog-ng can add the fractions to non-ISO8601 timestamps as well.

hdfs_archive_dir()
Type: string
Default: N/A

Description: The path where syslog-ng PE will move the closed log files. If syslog-ng PE cannot move the file for some reason (for example, syslog-ng PE cannot connect to the HDFS NameNode), the file remains at its original location. For example, hdfs_archive_dir("/usr/hdfs/archive/").

hdfs_file()
Type: string
Default: N/A

Description: The path and name of the log file. For example, hdfs_file("/usr/hdfs/mylogfile.txt"). syslog-ng PE checks if the path to the logfile exists. If a directory does not exist syslog-ng PE automatically creates it.

Macros are not supported in the file path and the filename. You can use only simple file paths for your log files, for example, /usr/hadoop/logfile.txt.

hdfs_max_filename_length()
Type: number
Default: 255

Description: The maximum length of the filename. This filename (including the UUID that syslog-ng PE appends to it) cannot be longer than what the file system permits. If the filename is longer than the value of hdfs_max_filename_length, syslog-ng PE will automatically truncate the filename. For example, hdfs_max_filename_length("255").

hdfs_resources()
Type: string
Default: N/A

Description: The list of Hadoop resources to load, separated by semicolons. For example, hdfs_resources("/home/user/hadoop/core-site.xml;/home/user/hadoop/hdfs-site.xml").

hdfs_uri()
Type: string
Default: N/A

Description: The URI of the HDFS NameNode is in hdfs://IPaddress:port or hdfs://hostname:port format. When using MapR-FS, the URI of the MapR-FS NameNode is in maprfs://IPaddress or maprfs://hostname format, for example: maprfs://10.140.32.80. The IP address of the node can be IPv4 or IPv6. For example, hdfs_uri("hdfs://10.140.32.80:8020"). The IPv6 address must be enclosed in square brackets ([]) as specified by RFC 2732, for example, hdfs_uri("hdfs://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:8020").

log-fifo-size()
Type: number (messages)
Default: Use global setting.

Description: The number of messages that the output queue can store.

on-error()
Accepted values: drop-message|drop-property|fallback-to-string|silently-drop-message|silently-drop-property|silently-fallback-to-string
Default: Use the global setting (which defaults to drop-message)

Description: Controls what happens when type-casting fails and syslog-ng PE cannot convert some data to the specified type. By default, syslog-ng PE drops the entire message and logs the error. Currently the value-pairs() option uses the settings of on-error().

  • drop-message: Drop the entire message and log an error message to the internal() source. This is the default behavior of syslog-ng PE.

  • drop-property: Omit the affected property (macro, template, or message-field) from the log message and log an error message to the internal() source.

  • fallback-to-string: Convert the property to string and log an error message to the internal() source.

  • silently-drop-message: Drop the entire message silently, without logging the error.

  • silently-drop-property: Omit the affected property (macro, template, or message-field) silently, without logging the error.

  • silently-fallback-to-string: Convert the property to string silently, without logging the error.

retries()
Type: number (of attempts)
Default: 3

Description: The number of times syslog-ng PE attempts to send a message to this destination. If syslog-ng PE could not send a message, it will try again until the number of attempts reaches retries, then drops the message.

template()
Type: string
Default: A format conforming to the default logfile format.

Description: Specifies a template defining the logformat to be used in the destination. Macros are described in the section called “Macros of syslog-ng PE. Please note that for network destinations it might not be appropriate to change the template as it changes the on-wire format of the syslog protocol which might not be tolerated by stock syslog receivers (like syslogd or syslog-ng itself). For network destinations make sure the receiver can cope with the custom format defined.

throttle()
Type: number (messages per second)
Default: 0

Description: Sets the maximum number of messages sent to the destination per second. Use this output-rate-limiting functionality only when using disk-buffer as well to avoid the risk of losing messages. Specifying 0 or a lower value sets the output limit to unlimited.

time-zone()
Type: name of the timezone, or the timezone offset
Default: unspecified

Description: Convert timestamps to the timezone specified by this option. If this option is not set, then the original timezone information in the message is used. Converting the timezone changes the values of all date-related macros derived from the timestamp, for example, HOUR. For the complete list of such macros, see the section called “Date-related macros”.

The timezone can be specified as using the name of the (for example time-zone("Europe/Budapest")), or as the timezone offset in +/-HH:MM format (for example +01:00). On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.

ts-format()
Type: rfc3164, bsd, rfc3339, iso
Default: Use the global option (which defaults to rfc3164)

Description: Override the global timestamp format (set in the global ts-format() parameter) for the specific destination. For details, see the section called “A note on timezones and timestamps”.

Publishing messages to Apache Kafka

Starting with version 5.4, syslog-ng PE can directly publish log messages to the Apache Kafka message bus, where subscribers can access them.

NOTE:

In order to use this destination, syslog-ng Premium Edition must run in server mode. Typically, only the central syslog-ng Premium Edition server uses this destination. For details on the server mode, see the section called “Server mode”.

  • This destination is only supported on the Linux platforms that use the linux glibc2.11 installer, including: Red Hat ES 7, Ubuntu 14.04 (Trusty Tahr).

  • Since syslog-ng PE uses the official Java Kafka producer, the kafka destination has significant memory usage.

  • The log messages of the underlying client libraries are available in the internal() source of syslog-ng PE.

Declaration: 

@module mod-java
@include "scl.conf"

kafka(
    client_lib_dir(/opt/syslog-ng/lib/syslog-ng/java-modules/:<path-to-preinstalled-kafka-libraries>)
    kafka_bootstrap_servers("1.2.3.4:9092,192.168.0.2:9092")
    topic("${HOST}")

);

Example 7.13. Sending log data to Apache Kafka

The following example defines a kafka destination, using only the required parameters.

@module mod-java
@include "scl.conf"

destination d_kafka {
  kafka(
    client_lib_dir(/opt/syslog-ng/lib/syslog-ng/java-modules/KafkaDestination.jar:/usr/share/kafka/lib)
    kafka_bootstrap_servers("1.2.3.4:9092,192.168.0.2:9092")
    topic("${HOST}")
  );
};

Procedure 7.6. Prerequisites

To publish messages from syslog-ng PE to Apache Kafka, complete the following steps.

Steps: 

  1. If you want to use the Java-based modules of syslog-ng PE (for example, the Elasticsearch, HDFS, or Kafka destinations), download and install the Java Runtime Environment (JRE), 1.7 (or newer).

    The Java-based modules of syslog-ng PE are tested and supported when using the Oracle implementation of Java. Other implementations are untested and unsupported, they may or may not work as expected.

  2. Download the latest stable binary release of the Apache Kafka libraries (version 0.8.2 or newer) from http://kafka.apache.org/downloads.html.

  3. Extract the Apache Kafka libraries into a single directory. If needed, collect the various .jar files into a single directory (for example, /opt/kafka/lib/) where syslog-ng PE can access them. You must specify this directory in the syslog-ng PE configuration file.

  4. Check if the following files in the Kafka libraries have the same version number: slf4j-api-<version-number>.jar, slf4j-log4j12-<version-number>.jar. If the version number of these files is different, complete the following steps:

    1. Delete one of the files (for example, slf4j-log4j12-<version-number>.jar).

    2. Download a version that matches the version number of the other file (for example, 1.7.6) from the official SLF4J distribution.

    3. Copy the downloaded file into the directory of your Kafka library files (for example, /opt/kafka/lib/).

How syslog-ng PE interacts with Apache Kafka

When stopping the syslog-ng PE application, syslog-ng PE will not stop until all Java threads are finished, including the threads started by the Kafka Producer. There is no way (except for the kill -9 command) to stop syslog-ng PE before the Kafka Producer stops. To change this behavior set the properties of the Kafka Producer in its properties file, and reference the file in the properties-file option.

The syslog-ng PEkafka destination tries to reconnect to the brokers in a tight loop. This can look as spinning, because of a lot of similar debug messages. To decrease the amount of such messages, set a bigger timeout using the following properties:

retry.backoff.ms=1000
reconnect.backoff.ms=1000

For details on using property files, see the section called “properties_file()”. For details on the properties that you can set in the property file, see the Apache Kafka documentation.

Kafka destination options

The kafka destination of syslog-ng PE can directly publish log messages to the Apache Kafka message bus, where subscribers can access them. The kafka destination has the following options.

Required options: 

The following options are required: kafka-bootstrap-servers(), topic(). Note that to use kafka, you must add the following lines to the beginning of your syslog-ng PE configuration:

@module mod-java
@include "scl.conf"
client_lib_dir()
Type: string
Default: N/A

Description: Include the path to the directory where you copied the required libraries (see Procedure 7.6, “Prerequisites”), for example, client_lib_dir(/user/share/kafka/lib).

disk-buffer()

Description: This option enables putting outgoing messages into the disk buffer of the destination to avoid message loss in case of a system failure on the destination side. It has the following options:

reliable()
Type: yes|no
Default: no

Description: If set to yes, syslog-ng PE cannot lose logs in case of reload/restart, unreachable destination or syslog-ng PE crash. This solution provides a slower, but reliable disk-buffer option. It is created and initialized at startup and gradually grows as new messages arrive. If set to no, the normal disk-buffer will be used. This provides a faster, but less reliable disk-buffer option.

Caution:

Hazard of data loss! If you change the value of reliable() option when there are messages in the disk-buffer, the messages stored in the disk-buffer will be lost.

dir()
Type: string
Default: N/A

Description: Defines the folder where the disk-buffer files are stored. This option has priority over --qdisk-dir=.

Caution:

When creating a new dir() option for a disk buffer, or modifying an existing one, make sure you delete the persist file, or at least remove the relevant persist-entry.

syslog-ng PE creates disk-buffer files based on the path recorded in the persist file. Therefore, if the persist file or the relevant entry is not deleted after modifying the dir() option, then following a restart, syslog-ng PE will look for or create disk-buffer files in their old location. To ensure that syslog-ng PE uses the new dir() setting, the persist file must not contain any information about the destinations which the disk-buffer file in question belongs to.

disk-buf-size()
Type: number (bytes)
Default:
Description: This is a required option. The maximum size of the disk-buffer in bytes. The minimum value is 1048576 bytes. If you set a smaller value, the minimum value will be used automatically. It replaces the old log-disk-fifo-size() option.
mem-buf-length()
Type: number (messages)
Default: 10000
Description: Use this option if the option reliable() is set to no. This option contains the number of messages stored in overflow queue. It replaces the old log-fifo-size() option. It inherits the value of the global log-fifo-size() option if provided. If it is not provided, the default value is 10000 messages. Note that this option will be ignored if the option reliable() is set to yes.
mem-buf-size()
Type: number (bytes)
Default: 163840000
Description: Use this option if the option reliable() is set to yes. This option contains the size of the messages in bytes that is used in the memory part of the disk buffer. It replaces the old log-fifo-size() option. It does not inherit the value of the global log-fifo-size() option, even if it is provided. Note that this option will be ignored if the option reliable() is set to no.
quot-size()
Type: number (messages)
Default: 64
Description: The number of messages stored in the output buffer of the destination.

Options reliable() and disk-buf-size() are required options.

Example 7.14. Examples for using disk-buffer()

In the following case reliable disk-buffer() is used.

destination d_demo {
    network(
            "127.0.0.1"
            port(3333)
            disk-buffer(
                mem-buf-size(10000)
                disk-buf-size(2000000)
                reliable(yes)
                dir("/tmp/disk-buffer")
            )
        );
};

In the following case normal disk-buffer() is used.

destination d_demo {
    network(
            "127.0.0.1"
            port(3333)
            disk-buffer(
                mem-buf-length(10000)
                disk-buf-size(2000000)
                reliable(no)
                dir("/tmp/disk-buffer")
            )
        );
};

kafka_bootstrap_servers()
Type: list of hostnames
Default:

Description: Specifies the hostname or IP address of the Kafka server. When specifying an IP address, IPv4 (for example, 192.168.0.1) or IPv6 (for example, [::1]) can be used as well. Use a colon (:) after the address to specify the port number of the server. When specifying multiple addresses, use a comma to separate the addresses, for example, kafka-bootstrap-servers("127.0.0.1:2525,remote-server-hostname:6464")

frac-digits()
Type: number (digits of fractions of a second)
Default: Value of the global option (which defaults to 0)

Description: The syslog-ng application can store fractions of a second in the timestamps according to the ISO8601 format. The frac-digits() parameter specifies the number of digits stored. The digits storing the fractions are padded by zeros if the original timestamp of the message specifies only seconds. Fractions can always be stored for the time the message was received. Note that syslog-ng can add the fractions to non-ISO8601 timestamps as well.

on-error()
Accepted values: drop-message|drop-property|fallback-to-string|silently-drop-message|silently-drop-property|silently-fallback-to-string
Default: Use the global setting (which defaults to drop-message)

Description: Controls what happens when type-casting fails and syslog-ng PE cannot convert some data to the specified type. By default, syslog-ng PE drops the entire message and logs the error. Currently the value-pairs() option uses the settings of on-error().

  • drop-message: Drop the entire message and log an error message to the internal() source. This is the default behavior of syslog-ng PE.

  • drop-property: Omit the affected property (macro, template, or message-field) from the log message and log an error message to the internal() source.

  • fallback-to-string: Convert the property to string and log an error message to the internal() source.

  • silently-drop-message: Drop the entire message silently, without logging the error.

  • silently-drop-property: Omit the affected property (macro, template, or message-field) silently, without logging the error.

  • silently-fallback-to-string: Convert the property to string silently, without logging the error.

key()
Type: template
Default: N/A

Description: The key of the partition under which the message is published. You can use templates to change the topic dynamically based on the source or the content of the message, for example, key("${PROGRAM}").

log-fifo-size()
Type: number (messages)
Default: Use global setting.

Description: The number of messages that the output queue can store.

properties_file()
Type: string (absolute path)
Default: N/A

Description: The absolute path and filename of the Kafka properties file to load. For example, properties_file("/opt/syslog-ng/etc/kafka_dest.properties"). The syslog-ng PE application reads this file and passes the properties to the Kafka Producer. If a property is defined both in the syslog-ng PE configuration file (syslog-ng.conf) and in the properties file, then syslog-ng PE uses the definition from the syslog-ng PE configuration file.

The syslog-ng PEkafka destination supports all properties of the official Kafka producer. For details, see the Apache Kafka documentation.

The kafka-bootstrap-servers option is translated to the bootstrap.servers property.

For example, the following properties file defines the acknowledgement method and compression:

acks=all
compression.type=snappy
retries()
Type: number (of attempts)
Default: 3

Description: The number of times syslog-ng PE attempts to send a message to this destination. If syslog-ng PE could not send a message, it will try again until the number of attempts reaches retries, then drops the message.

template()
Type: template or template function
Default: $ISODATE $HOST $MSGHDR$MSG\n

Description: The message as published to Apache Kafka. You can use templates and template functions (for example, format-json()) to format the message, for example, template("$(format-json --scope rfc5424 --exclude DATE --key ISODATE)").

For details on formatting messages in JSON format, see the section called “format-json”.

throttle()
Type: number (messages per second)
Default: 0

Description: Sets the maximum number of messages sent to the destination per second. Use this output-rate-limiting functionality only when using disk-buffer as well to avoid the risk of losing messages. Specifying 0 or a lower value sets the output limit to unlimited.

topic()
Type: template
Default: N/A

Description: The Kafka topic under which the message is published. You can use templates to change the topic dynamically based on the source or the content of the message, for example, topic("${HOST}").

sync_send()
Type: true | false
Default: false

Description: When sync_send is set to true, syslog-ng PE sends the message reliably: it sends a message to the Kafka server, then waits for a reply. In case of failure, syslog-ng PE repeats sending the message, as set in the retries() parameter. If sending the message fails for retries() times, syslog-ng PE drops the message.

This method ensures reliable message transfer, but is very slow.

When sync_send is set to false, syslog-ng PE sends messages asynchronously, and receives the response asynchronously. In case of a problem, syslog-ng PE cannot resend the messages.

This method is fast, but the transfer is not reliable. Several thousands of messages can be lost before syslog-ng PE recognizes the error.

time-zone()
Type: name of the timezone, or the timezone offset
Default: unspecified

Description: Convert timestamps to the timezone specified by this option. If this option is not set, then the original timezone information in the message is used. Converting the timezone changes the values of all date-related macros derived from the timestamp, for example, HOUR. For the complete list of such macros, see the section called “Date-related macros”.

The timezone can be specified as using the name of the (for example time-zone("Europe/Budapest")), or as the timezone offset in +/-HH:MM format (for example +01:00). On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.

ts-format()
Type: rfc3164, bsd, rfc3339, iso
Default: Use the global option (which defaults to rfc3164)

Description: Override the global timestamp format (set in the global ts-format() parameter) for the specific destination. For details, see the section called “A note on timezones and timestamps”.

Related Documents