syslog-ng Premium Edition 7.0.33 - Administration Guide

Kerberos authentication with syslog-ng hdfs() destination

syslog-ng PE version 7.0.3 and later support Kerberos authentication for the connection to your Hadoop cluster. syslog-ng PE assumes that you already have a Hadoop and Kerberos infrastructure.

NOTE: If you configure Kerberos authentication for a hdfs() destination, it affects all hdfs() destinations. Kerberos and non-Kerberos hdfs() destinations cannot be mixed in a syslog-ng PE configuration. This means that if one hdfs() destination uses Kerberos authentication, you have to configure all other hdfs() destinations to use Kerberos authentication too.

Failing to do so results in non-Kerberos hdfs() destinations being unable to authenticate to the HDFS server.

NOTE: If you want your hdfs() destination to stop using Kerberos authentication (that is, you remove the Kerberos-related options from the hdfs() destination configuration), restart syslog-ng PE for the changes to take effect.

Prerequisites
  • You have configured your Hadoop infrastructure to use Kerberos authentication.

  • You have a keytab file and a principal for the host running syslog-ng PE.

  • You have installed and configured the Kerberos client packages on the host running syslog-ng PE. (That is, Kerberos authentication works for the host, for example, from the command line using the kinit user@REALM -k -t <keytab_file> command.)

destination d_hdfs {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp-kerberos.syslog-ng.example:8020")
        kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab")
        kerberos-principal("hdfs-hdpkerberos@MYREALM")
        hdfs-file("/var/hdfs/test.log")
    );
};

HDFS destination options

The hdfs destination stores the log messages in files on the Hadoop Distributed File System (HDFS). The hdfs destination has the following options.

The following options are required: hdfs-file(), hdfs-uri(). Note that to use hdfs, you must add the following lines to the beginning of your syslog-ng PE configuration:

@module mod-java
@include "scl.conf"

client-lib-dir()
Type: string
Default: The syslog-ng PE module directory: /opt/syslog-ng/lib/syslog-ng/java-modules/

Description: The list of the paths where the required Java classes are located. For example, client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/my-java-libraries/libs/"). If you set this option multiple times in your syslog-ng PE configuration (for example, because you have multiple Java-based destinations), syslog-ng PE will merge all available paths into a single list.

For the hdfs destination, include the path to the directory where you copied the required libraries (see Prerequisites), for example, client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/hadoop/libs/").
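
As a minimal sketch of how these pieces fit together, the following fragment loads the Java module, includes scl.conf, and defines an hdfs() destination with only the required options plus client-lib-dir(). The destination name d_hdfs_minimal is hypothetical, and the paths and the NameNode address are the sample values used elsewhere in this section, so adjust them to your environment.

```
@module mod-java
@include "scl.conf"

destination d_hdfs_minimal {
    hdfs(
        # Directories that hold the syslog-ng PE Java modules and the Hadoop libraries
        client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/hadoop/libs/")
        # Required: URI of the HDFS NameNode
        hdfs-uri("hdfs://10.140.32.80:8020")
        # Required: path and name of the log file on HDFS
        hdfs-file("/usr/hdfs/mylogfile.txt")
    );
};
```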

disk-buffer()

Description: This option enables putting outgoing messages into the disk-buffer file of the destination to avoid message loss in case of a system failure on the destination side. It has the following options:

dir()
Type: string
Default: N/A

Description: Defines the folder where the disk-buffer files are stored.

Note that changing the value of the dir() option will not move or copy existing files from the old directory to the new one.

Caution:

When creating a new dir() option for a disk-buffer file, or modifying an existing one, make sure you delete the persist file.

syslog-ng PE creates disk-buffer files based on the path recorded in the persist file. Therefore, if the persist file is not deleted after modifying the dir() option, then following a restart, syslog-ng PE will look for or create disk-buffer files in their old location. To ensure that syslog-ng PE uses the new dir() setting, the persist file must not contain any information about the destinations which the disk-buffer file in question belongs to.

disk-buf-size()
Type: number [bytes]
Default:

Description: This is a required option. The maximum size of the disk-buffer file in bytes. The minimum value is 1048576 bytes. If you set a smaller value, the minimum value will be used automatically. It replaces the old log-disk-fifo-size() option.
mem-buf-length()
Type: number [messages]
Default: 10000
Description: Use this option if the reliable() option is set to no. This option sets the number of messages stored in the overflow queue. It replaces the old log-fifo-size() option. It inherits the value of the global log-fifo-size() option if provided. If it is not provided, the default value is 10000 messages. Note that this option will be ignored if the reliable() option is set to yes.
mem-buf-size()
Type: number [bytes]
Default: 163840000
Description: Use this option if the option reliable() is set to yes. This option contains the size of the messages in bytes that is used in the memory part of the disk-buffer file. It replaces the old log-fifo-size() option. It does not inherit the value of the global log-fifo-size() option, even if it is provided. Note that this option will be ignored if the option reliable() is set to no.
qout-size()
Type: number [messages]
Default: 1000

Description: The number of messages stored in the output buffer of the destination. Note that if you change the value of this option and the disk-buffer file already exists, the change will take effect when the disk-buffer file becomes empty.

reliable()
Type: yes|no
Default: no

Description: If set to yes, syslog-ng PE cannot lose logs in case of reload/restart, unreachable destination or syslog-ng PE crash. This solution provides a slower, but reliable disk-buffer option. It is created and initialized at startup and gradually grows as new messages arrive. If set to no, the normal disk-buffer option will be used. This provides a faster, but less reliable disk-buffer option.

Caution: Hazard of data loss!

If you change the value of reliable() option when there are messages in the disk-buffer file, the messages stored in the disk-buffer file will be lost.

truncate-size-ratio()
Type: number (for percentage) between 0 and 1
Default: 0.1 (10%)

Description: Limits the truncation of the disk-buffer file. Truncating the disk-buffer file can slow down disk I/O operations, but it saves disk space. As a result, syslog-ng PE only truncates the file if the possible disk gain is more than truncate-size-ratio() times disk-buf-size().

Caution:

One Identity recommends that you do not modify the value of the truncate-size-ratio() option unless you are fully aware of the potential performance implications.

Example: Examples for using disk-buffer()

In the following case, reliable disk-buffer() is used.

destination d_demo {
    network("127.0.0.1"
        port(3333)
        disk-buffer(
            mem-buf-size(10000)
            disk-buf-size(2000000)
            reliable(yes)
            dir("/tmp/disk-buffer")
        )
    );
};

In the following case normal disk-buffer() is used.

destination d_demo {
    network("127.0.0.1"
        port(3333)
        disk-buffer(
            mem-buf-length(10000)
            disk-buf-size(2000000)
            reliable(no)
            dir("/tmp/disk-buffer")
        )
    );
};

frac-digits()
Type: number
Default: 0

Description: The syslog-ng PE application can store fractions of a second in the timestamps according to the ISO8601 format. The frac-digits() parameter specifies the number of digits stored. The digits storing the fractions are padded by zeros if the original timestamp of the message specifies only seconds. Fractions can always be stored for the time the message was received.

NOTE: The syslog-ng PE application can add the fractions to non-ISO8601 timestamps as well.

NOTE: As syslog-ng PE is precise up to the microsecond, when the frac-digits() option is set to a value higher than 6, syslog-ng PE will truncate the fractional seconds in the timestamps after 6 digits.
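
The following fragment is a minimal sketch of storing millisecond precision in the timestamps written to HDFS. It assumes that you want ISO8601 timestamps, so it also sets ts-format(iso) for the destination. The destination name d_hdfs_fracdigits is hypothetical, and the paths and host name are the sample values used in the other examples of this section.

```
destination d_hdfs_fracdigits {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/hdfs/test.log")
        # Use ISO8601 timestamps for this destination
        ts-format(iso)
        # Keep three digits of the fractional seconds (milliseconds)
        frac-digits(3)
    );
};
```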

hdfs-append-enabled()
Type: true | false
Default: false

Description: When hdfs-append-enabled is set to true, syslog-ng PE will append new data to the end of an already existing HDFS file. Note that in this case, archiving is automatically disabled, and syslog-ng PE will ignore the hdfs-archive-dir option.

When hdfs-append-enabled is set to false, the syslog-ng PE application always creates a new file if the previous has been closed. In that case, appending data to existing files is not supported.

When you choose to write data into an existing file, syslog-ng PE does not extend the filename with a UUID suffix because there is no need to open a new file (a new unique ID would mean opening a new file and writing data into that).

Caution:

Before enabling the hdfs-append-enabled option, ensure that your HDFS server supports the append operation and that it is enabled. Otherwise syslog-ng PE will not be able to append data into an existing file, resulting in an error log.
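
A minimal sketch of a destination that appends to an existing HDFS file could look like the following. It assumes that the append operation is supported and enabled on your HDFS server. The destination name d_hdfs_append is hypothetical, and the other values are the sample values used elsewhere in this section.

```
destination d_hdfs_append {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/hdfs/test.log")
        # Append to the existing file instead of creating a new one with a UUID suffix.
        # Archiving is disabled in this mode, so hdfs-archive-dir() would be ignored.
        hdfs-append-enabled(true)
    );
};
```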

hdfs-archive-dir()
Type: string
Default: N/A

Description: The path where syslog-ng PE will move the closed log files. If syslog-ng PE cannot move the file for some reason (for example, syslog-ng PE cannot connect to the HDFS NameNode), the file remains at its original location. For example, hdfs-archive-dir("/usr/hdfs/archive/").

NOTE: When hdfs-append-enabled is set to true, archiving is automatically disabled, and syslog-ng PE will ignore the hdfs-archive-dir option.
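
As an illustration, the following sketch moves closed files to an archive directory. It leaves hdfs-append-enabled() at its default value (false) and uses the time-reap() option, described later in this section, to close idle files so that archiving is triggered. The destination name d_hdfs_archive and the 60-second value are hypothetical.

```
destination d_hdfs_archive {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/usr/hdfs/mylogfile.txt")
        # Closed files are moved here
        hdfs-archive-dir("/usr/hdfs/archive/")
        # Close the file after 60 seconds of inactivity (illustrative value)
        time-reap(60)
    );
};
```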

hdfs-file()
Type: string
Default: N/A

Description: The path and name of the log file. For example, hdfs-file("/usr/hdfs/mylogfile.txt"). syslog-ng PE checks if the path to the log file exists. If a directory does not exist, syslog-ng PE automatically creates it.

hdfs-file() supports the usage of macros. This means that syslog-ng PE can create files on HDFS dynamically, using macros in the file (or directory) name.

NOTE: When a filename resolved from the macros contains a character that HDFS does not support, syslog-ng PE will not be able to create the file. Make sure that you use macros that do not contain unsupported characters.

Example: Using macros in filenames

In the following example, a /var/testdb_working_dir/$DAY-$HOUR.txt file will be created (with a UUID suffix):

destination d_hdfs_9bf3ff45341643c69bf46bfff940372a {
    hdfs(
        client-lib-dir("/hdfs-libs")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/testdb_working_dir/$DAY-$HOUR.txt")
    );
};

As an example, if it is the 31st day of the month and it is 12 o'clock, then the name of the file will be 31-12.txt.

hdfs-max-filename-length()
Type: number
Default: 255

Description: The maximum length of the filename. This filename (including the UUID that syslog-ng PE appends to it) cannot be longer than what the file system permits. If the filename is longer than the value of hdfs-max-filename-length, syslog-ng PE will automatically truncate the filename. For example, hdfs-max-filename-length("255").

hdfs-resources()
Type: string
Default: N/A

Description: The list of Hadoop resources to load, separated by semicolons. For example, hdfs-resources("/home/user/hadoop/core-site.xml;/home/user/hadoop/hdfs-site.xml").
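
For instance, a destination that loads two Hadoop resource files might be sketched as follows. The destination name d_hdfs_resources is hypothetical, and the file paths are the sample paths from the description above.

```
destination d_hdfs_resources {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/hdfs/test.log")
        # Semicolon-separated list of Hadoop resource files to load
        hdfs-resources("/home/user/hadoop/core-site.xml;/home/user/hadoop/hdfs-site.xml")
    );
};
```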

hdfs-uri()
Type: string
Default: N/A

Description: The URI of the HDFS NameNode, in hdfs://IPaddress:port or hdfs://hostname:port format. When using MapR-FS, specify the URI of the MapR-FS NameNode in maprfs://IPaddress or maprfs://hostname format, for example: maprfs://10.140.32.80. The IP address of the node can be IPv4 or IPv6. For example, hdfs-uri("hdfs://10.140.32.80:8020"). The IPv6 address must be enclosed in square brackets ([]) as specified by RFC 2732, for example, hdfs-uri("hdfs://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:8020").

jvm-options()
Type: list
Default: N/A

Description: Specify the Java Virtual Machine (JVM) settings of your Java destination from the syslog-ng PE configuration file.

For example:

jvm-options("-Xss1M -XX:+TraceClassLoading")

You can set this option only as a global option, by adding it to the options statement of the syslog-ng configuration file.
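
Because the option is global, a minimal sketch places it in the options statement rather than inside the hdfs() destination. The JVM flags are the sample flags shown above.

```
options {
    # Applies to the JVM shared by every Java-based destination
    jvm-options("-Xss1M -XX:+TraceClassLoading");
};
```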

kerberos-keytab-file()
Type: string
Default: N/A

Description: The path to the Kerberos keytab file that you received from your Kerberos administrator. For example, kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab"). This option is needed only if you want to authenticate using Kerberos in Hadoop. You also have to set the kerberos-principal() option. For details on using Kerberos authentication with the hdfs() destination, see Kerberos authentication with syslog-ng hdfs() destination.

destination d_hdfs {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp-kerberos.syslog-ng.example:8020")
        kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab")
        kerberos-principal("hdfs-hdpkerberos@MYREALM")
        hdfs-file("/var/hdfs/test.log")
    );
};

Available in syslog-ng PE version 7.0.3 and later.

kerberos-principal()
Type: string
Default: N/A

Description: The Kerberos principal you want to authenticate with. For example, kerberos-principal("hdfs-user@MYREALM"). This option is needed only if you want to authenticate using Kerberos in Hadoop. You also have to set the kerberos-keytab-file() option. For details on using Kerberos authentication with the hdfs() destination, see Kerberos authentication with syslog-ng hdfs() destination.

destination d_hdfs {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp-kerberos.syslog-ng.example:8020")
        kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab")
        kerberos-principal("hdfs-hdpkerberos@MYREALM")
        hdfs-file("/var/hdfs/test.log")
    );
};

Available in syslog-ng PE version 7.0.3 and later.

log-fifo-size()
Type: number
Default: Use global setting.

Description: The number of messages that the output queue can store.

on-error()
Accepted values:

drop-message|drop-property|fallback-to-string|

silently-drop-message|silently-drop-property|silently-fallback-to-string

Default: Use the global setting (which defaults to drop-message)

Description: Controls what happens when type-casting fails and syslog-ng PE cannot convert some data to the specified type. By default, syslog-ng PE drops the entire message and logs the error. Currently the value-pairs() option uses the settings of on-error().

  • drop-message: Drop the entire message and log an error message to the internal() source. This is the default behavior of syslog-ng PE.

  • drop-property: Omit the affected property (macro, template, or message-field) from the log message and log an error message to the internal() source.

  • fallback-to-string: Convert the property to string and log an error message to the internal() source.

  • silently-drop-message: Drop the entire message silently, without logging the error.

  • silently-drop-property: Omit the affected property (macro, template, or message-field) silently, without logging the error.

  • silently-fallback-to-string: Convert the property to string silently, without logging the error.

retries()
Type: number [of attempts]
Default: 3

Description: The number of times syslog-ng PE attempts to send a message to this destination. If syslog-ng PE could not send a message, it will try again until the number of attempts reaches retries(), then drops the message.

template()
Type: string
Default: A format conforming to the default logfile format.

Description: Specifies a template defining the logformat to be used in the destination. Macros are described in Macros of syslog-ng PE. Please note that for network destinations it might not be appropriate to change the template as it changes the on-wire format of the syslog protocol which might not be tolerated by stock syslog receivers (like syslogd or syslog-ng itself). For network destinations make sure the receiver can cope with the custom format defined.

throttle()
Type: number
Default: 0

Description: Sets the maximum number of messages sent to the destination per second. Use this output-rate-limiting functionality only when using the disk-buffer option as well to avoid the risk of losing messages. Specifying 0 or a lower value sets the output limit to unlimited.
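
The following sketch combines throttle() with a reliable disk-buffer, as recommended above. The destination name d_hdfs_throttled and the rate of 1000 messages per second are hypothetical. The disk-buffer values are taken from the disk-buffer() examples earlier in this section.

```
destination d_hdfs_throttled {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/hdfs/test.log")
        # Send at most 1000 messages per second (illustrative value)
        throttle(1000)
        # Buffer the messages that exceed the rate limit instead of losing them
        disk-buffer(
            mem-buf-size(10000)
            disk-buf-size(2000000)
            reliable(yes)
            dir("/tmp/disk-buffer")
        )
    );
};
```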

time-reap()
Accepted values: number (seconds)
Default: 0 (disabled)

Description: The time to wait in seconds before an idle destination file is closed. Note that if the hdfs-archive-dir() option is set and time-reap() expires, archiving is triggered for the affected file.

time-zone()
Type: name of the timezone, or the timezone offset
Default: unspecified

Description: Convert timestamps to the timezone specified by this option. If this option is not set, then the original timezone information in the message is used. Converting the timezone changes the values of all date-related macros derived from the timestamp, for example, HOUR. For the complete list of such macros, see Date-related macros.

The timezone can be specified by using the name of the timezone (for example, time-zone("Europe/Budapest")), or as the timezone offset in +/-HH:MM format (for example, +01:00). On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.
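
As a sketch, the following destination converts timestamps to a named timezone. Because the file name uses the $DAY and $HOUR macros, the converted time also determines which file a message is written to. The destination name d_hdfs_tz is hypothetical, and the other values are the sample values used elsewhere in this section.

```
destination d_hdfs_tz {
    hdfs(
        client-lib-dir("/hdfs-libs")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        # $DAY and $HOUR are derived from the converted timestamp
        hdfs-file("/var/testdb_working_dir/$DAY-$HOUR.txt")
        time-zone("Europe/Budapest")
    );
};
```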

ts-format()
Type: rfc3164, bsd, rfc3339, iso
Default: rfc3164

Description: Override the global timestamp format (set in the global ts-format() parameter) for the specific destination. For details, see ts-format().

http: Posting messages over HTTP

Starting with version 7.0.4, syslog-ng PE can directly post log messages to web services using the HTTP protocol.

Limitations

The current implementation of the http() destination has the following limitations:

  • Only the PUT and the POST methods are supported.

  • HTTPS connections, as well as password-based and certificate-based authentication, are supported.

  • If the server returns a status code beginning with 4 (for example, 404) to the POST or PUT request, syslog-ng PE drops the message without attempting to resend it.

NOTE: Typically, only the central syslog-ng PE server uses this destination. For more information on the server mode, see Server mode.

Example: Client certificate authentication with HTTPS

destination d_https {
    http(
        [...]
        tls(
            ca-file("/<path-to-certificate-directory>/ca-crt.pem")
            ca-dir("/<path-to-certificate-directory>/")
            cert-file("/<path-to-certificate-directory>/server-crt.pem")
            key-file("/<path-to-certificate-directory>/server-key.pem")
            )
        [...]
    );
};

Declaration

destination d_http {
    http(
        url("<web-service-IP-or-hostname>")
        method("<HTTP-method>")
        user-agent("<USER-AGENT-message-value>")
        user("<username>")
        password("<password>")
    );
};

Example: Sending log data to a web service

The following example defines an http() destination.

destination d_http {
    http(
        url("http://127.0.0.1:8000")
        method("PUT")
        user-agent("syslog-ng User Agent")
        user("user")
        password("password")
        headers("HEADER1: header1", "HEADER2: header2")
        body("${ISODATE} ${MESSAGE}")
    );
};

log {
    source(s_file);
    destination(d_http);
    flags(flow-control);
};

Batch mode and load balancing

Starting with version 7.0.12, you can send multiple log messages in a single HTTP request if the destination HTTP server supports that.

Batch size

The batch-bytes(), batch-lines(), and batch-timeout() options of the destination determine how many log messages syslog-ng PE sends in a batch. The batch-lines() option determines the maximum number of messages syslog-ng PE puts in a batch. This can be limited based on size and time:

  • syslog-ng PE sends a batch every batch-timeout() milliseconds, even if the number of messages in the batch is less than batch-lines(). This ensures that the destination receives every message in a timely manner even if suddenly there are no more messages.

  • syslog-ng PE sends the batch if the total size of the messages in the batch reaches batch-bytes() bytes.

To increase the performance of the destination, increase the number of worker threads for the destination using the workers() option, or adjust the batch-bytes(), batch-lines(), batch-timeout() options.

Formatting the batch

By default, syslog-ng PE separates the log messages of the batch with a newline character. You can specify a different delimiter by using the delimiter() option.

If the target application or server requires a special beginning or ending to recognize batches, use the body-prefix() and body-suffix() options to add a beginning and ending to the batch. For example, you can use these options to create JSON-encoded arrays as POST payloads, which is required by a number of REST APIs. The body of a batch HTTP request looks like this:

value of body-prefix() option
log-line-1 (as formatted in the body() option)
log-line-2 (as formatted in the body() option)
....
log-line-n (the number of log lines is batch-lines(), or less if batch-timeout() has elapsed or the batch would be longer than batch-bytes())
value of body-suffix() option
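
To illustrate the JSON-array case mentioned above, the following sketch wraps each batch in square brackets and separates the messages with commas. The destination name d_http_json_array, the Content-Type header, and the batch values are assumptions for illustration. The URL is the sample address used in the other http() examples.

```
destination d_http_json_array {
    http(
        url("http://127.0.0.1:8000")
        method("POST")
        batch-lines(100)
        batch-timeout(10000)
        headers("Content-Type: application/json")
        # Opening bracket of the JSON array
        body-prefix("[")
        # Each log message becomes one JSON object
        body('$(format-json --scope rfc5424 --key ISODATE)')
        # Comma between the objects instead of the default newline
        delimiter(",")
        # Closing bracket of the JSON array
        body-suffix("]")
    );
};
```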
Example: HTTP batch mode

The following destination sends log messages to an Elasticsearch server using the bulk API. A batch consists of 100 messages, or a maximum of 512 kilobytes, and is sent every 10 seconds (10000 milliseconds).

destination d_http {
    http(url("http://your-elasticsearch-server/_bulk")
        method("POST")
        batch-lines(100)
        batch-bytes(512Kb)
        batch-timeout(10000)
        headers("Content-Type: application/x-ndjson")
        body-suffix("\n")
        body('{ "index":{} }
             $(format-json --scope rfc5424 --key ISODATE)')
    );
};

Load balancing between multiple servers

Starting with version 7.0.12, you can specify multiple URLs, for example, url("site1" "site2"). In this case, syslog-ng PE sends log messages to the specified URLs in a load-balanced fashion. This means that syslog-ng PE sends each message to only one URL. For example, you can use this to send the messages to a set of ingestion nodes or indexers of your SIEM solution if a single node cannot handle the load. Note that the order of the messages as they arrive on the servers can differ from the order in which syslog-ng PE received them, so use load balancing only if your server can use the timestamp from the messages. If the server uses the time when it receives the messages as the timestamp, the order of the messages will be incorrect.

Caution:

If you set multiple URLs in the url() option, set the persist-name() option as well to avoid data loss.

Example: HTTP load balancing

The following destination sends log messages to an Elasticsearch server using the bulk API, to 3 different ingest nodes. Each node is assigned a separate worker thread. A batch consists of 100 messages, or a maximum of 512 kilobytes, and is sent every 10 seconds (10000 milliseconds).

destination d_http {
    http(url("http://your-elasticsearch-server/_bulk" "http://your-second-ingest-node/_bulk" "http://your-third-ingest-node/_bulk")
        method("POST")
        batch-lines(100)
        batch-bytes(512Kb)
        batch-timeout(10000)
        workers(3)
        headers("Content-Type: application/x-ndjson")
        body-suffix("\n")
        body('{ "index":{} }
             $(format-json --scope rfc5424 --key ISODATE)')
        persist-name("d_http-load-balance")
    );
};

If you are using load-balancing (that is, you have configured multiple servers in the url() option), increase the number of worker threads at least to the number of servers. For example, if you have set three URLs (url("site1" "site2" "site3")), set the workers() option to 3 or more.
