syslog-ng OSE version 3.10 and later supports Kerberos authentication for the connection to your Hadoop cluster. syslog-ng OSE assumes that you already have a Hadoop and Kerberos infrastructure.
NOTE: If you configure Kerberos authentication for a hdfs() destination, it affects all hdfs() destinations. Kerberos and non-Kerberos hdfs() destinations cannot be mixed in a syslog-ng OSE configuration. This means that if one hdfs() destination uses Kerberos authentication, you have to configure all other hdfs() destinations to use Kerberos authentication too.
Failing to do so results in non-Kerberos hdfs() destinations being unable to authenticate to the HDFS server.
NOTE: If you want your hdfs() destination to stop using Kerberos authentication, that is, you remove the Kerberos-related options from the hdfs() destination configuration, make sure to restart syslog-ng OSE for the changes to take effect.
Prerequisites:
- You have configured your Hadoop infrastructure to use Kerberos authentication.
- You have a keytab file and a principal for the host running syslog-ng OSE. For details, see the Kerberos documentation.
- You have installed and configured the Kerberos client packages on the host running syslog-ng OSE. (That is, Kerberos authentication works for the host, for example, from the command line using the kinit user@REALM -k -t <keytab_file> command.)
destination d_hdfs {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp-kerberos.syslog-ng.example:8020")
        kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab")
        kerberos-principal("hdfs-hdpkerberos@MYREALM")
        hdfs-file("/var/hdfs/test.log")
    );
};
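Because Kerberos and non-Kerberos hdfs() destinations cannot be mixed, a configuration with several hdfs() destinations has to repeat the Kerberos options in each of them. The following is a minimal sketch of that rule, not a tested configuration. The destination names d_hdfs_app and d_hdfs_audit and the two hdfs-file() paths are hypothetical; the library path, URI, keytab file, and principal are reused from the example above.

```
# Hypothetical names and file paths; both destinations carry the Kerberos options.
destination d_hdfs_app {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp-kerberos.syslog-ng.example:8020")
        kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab")
        kerberos-principal("hdfs-hdpkerberos@MYREALM")
        hdfs-file("/var/hdfs/app.log")
    );
};

destination d_hdfs_audit {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp-kerberos.syslog-ng.example:8020")
        kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab")
        kerberos-principal("hdfs-hdpkerberos@MYREALM")
        hdfs-file("/var/hdfs/audit.log")
    );
};
```

If either destination omitted the two kerberos-*() options, it would be the non-Kerberos destination that fails to authenticate to the HDFS server.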
The hdfs destination stores the log messages in files on the Hadoop Distributed File System (HDFS). The hdfs destination has the following options.
The following options are required: hdfs-file(), hdfs-uri(). Note that to use hdfs, you must add the following line to the beginning of your syslog-ng OSE configuration:
@include "scl.conf"
client-lib-dir()
Type: string
Default: The syslog-ng OSE module directory: /opt/syslog-ng/lib/syslog-ng/java-modules/
Description: The list of paths where the required Java classes are located. For example, client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/my-java-libraries/libs/"). If you set this option multiple times in your syslog-ng OSE configuration (for example, because you have multiple Java-based destinations), syslog-ng OSE merges all available paths into a single list.
For the hdfs destination, include the path to the directory where you copied the required libraries (see Prerequisites), for example, client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/hadoop/libs/").
disk-buffer()
Description: This option enables putting outgoing messages into the disk buffer of the destination to avoid message loss in case of a system failure on the destination side. It has the following options:
reliable()
Type: yes|no
Default: no
Description: If set to yes, syslog-ng OSE cannot lose logs in case of reload/restart, unreachable destination or syslog-ng OSE crash. This solution provides a slower, but reliable disk-buffer option. It is created and initialized at startup and gradually grows as new messages arrive. If set to no, the normal disk-buffer will be used. This provides a faster, but less reliable disk-buffer option.
Caution: Hazard of data loss! If you change the value of the reliable() option when there are messages in the disk-buffer, the messages stored in the disk-buffer will be lost.
compaction()
Type: yes|no
Default: no
Description: If set to yes, syslog-ng OSE prunes the unused space in the LogMessage representation, making the disk queue size smaller at the cost of some CPU time. Setting the compaction() argument to yes is recommended when numerous name-value pairs are unset during processing, or when the same names are set multiple times.
NOTE: Simply unsetting these name-value pairs by using the unset() rewrite operation is not enough: for performance reasons (which help when syslog-ng OSE is CPU-bound), the internal representation of a LogMessage does not release the memory associated with these name-value pairs. In some cases, however, the size of this overhead becomes significant (the raw message size can grow up to four times its original size), which unnecessarily increases the disk queue file size. For these cases, compaction drops the "unset" values, making the LogMessage representation smaller at the cost of some CPU time required to perform compaction.
dir()
Type: string
Default: N/A
Description: Defines the folder where the disk-buffer files are stored.
Caution: When creating a new dir() option for a disk buffer, or modifying an existing one, make sure you delete the persist file.
syslog-ng OSE creates disk-buffer files based on the path recorded in the persist file. Therefore, if the persist file is not deleted after modifying the dir() option, then following a restart, syslog-ng OSE will look for or create disk-buffer files in their old location. To ensure that syslog-ng OSE uses the new dir() setting, the persist file must not contain any information about the destinations which the disk-buffer file in question belongs to.
disk-buf-size()
Type: number (bytes)
Default: none (required option)
Description: This is a required option. The maximum size of the disk-buffer in bytes. The minimum value is 1048576 bytes. If you set a smaller value, the minimum value will be used automatically. It replaces the old log-disk-fifo-size() option.
mem-buf-length()
Type: number (messages)
Default: 10000
Description: Use this option if the reliable() option is set to no. This option contains the number of messages stored in the overflow queue. It replaces the old log-fifo-size() option. It inherits the value of the global log-fifo-size() option if provided. If it is not provided, the default value is 10000 messages. Note that this option will be ignored if the reliable() option is set to yes.
mem-buf-size()
Type: number (bytes)
Default: 163840000
Description: Use this option if the option reliable() is set to yes. This option contains the size of the messages in bytes that is used in the memory part of the disk buffer. It replaces the old log-fifo-size() option. It does not inherit the value of the global log-fifo-size() option, even if it is provided. Note that this option will be ignored if the option reliable() is set to no.
qout-size()
Type: number (messages)
Default: 64
Description: The number of messages stored in the output buffer of the destination. Note that if you change the value of this option and the disk-buffer already exists, the change will take effect when the disk-buffer becomes empty.
The reliable() and disk-buf-size() options are required.
Example: Examples for using disk-buffer()
In the following case reliable disk-buffer() is used.
destination d_demo {
    network(
        "127.0.0.1"
        port(3333)
        disk-buffer(
            mem-buf-size(10000)
            disk-buf-size(2000000)
            reliable(yes)
            dir("/tmp/disk-buffer")
        )
    );
};
In the following case normal disk-buffer() is used.
destination d_demo {
    network(
        "127.0.0.1"
        port(3333)
        disk-buffer(
            mem-buf-length(10000)
            disk-buf-size(2000000)
            reliable(no)
            dir("/tmp/disk-buffer")
        )
    );
};
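The examples above use the network() destination. As a rough sketch only, the same disk-buffer() block could be placed inside an hdfs() destination, assuming disk-buffer() accepts the same sub-options there as it does in network(). The destination name d_hdfs_buffered is hypothetical; the URI, library path, and file path are borrowed from other examples in this section, and the buffer sizes are the ones used above.

```
# Sketch: reliable disk-buffer on an hdfs() destination (name is hypothetical).
destination d_hdfs_buffered {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/hdfs/test.log")
        disk-buffer(
            mem-buf-size(10000)
            disk-buf-size(2000000)
            reliable(yes)
            dir("/tmp/disk-buffer")
        )
    );
};
```

With reliable(yes), mem-buf-size() applies and mem-buf-length() would be ignored, as described in the option list.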
frac-digits()
Description: The syslog-ng application can store fractions of a second in the timestamps according to the ISO8601 format. The frac-digits() parameter specifies the number of digits stored. The digits storing the fractions are padded by zeros if the original timestamp of the message specifies only seconds. Fractions can always be stored for the time the message was received.
NOTE: The syslog-ng OSE application can add the fractions to non-ISO8601 timestamps as well.
NOTE: As syslog-ng OSE is precise up to the microsecond, when the frac-digits() option is set to a value higher than 6, syslog-ng OSE will truncate the fractional seconds in the timestamps after 6 digits.
hdfs-append-enabled()
Type: true | false
Default: false
Description: When hdfs-append-enabled is set to true, syslog-ng OSE will append new data to the end of an already existing HDFS file. Note that in this case, archiving is automatically disabled, and syslog-ng OSE will ignore the hdfs-archive-dir option.
When hdfs-append-enabled is set to false, the syslog-ng OSE application always creates a new file if the previous has been closed. In that case, appending data to existing files is not supported.
When you choose to write data into an existing file, syslog-ng OSE does not extend the filename with a UUID suffix because there is no need to open a new file (a new unique ID would mean opening a new file and writing data into that).
Caution: Before enabling the hdfs-append-enabled option, ensure that your HDFS server supports the append operation and that it is enabled. Otherwise syslog-ng OSE will not be able to append data into an existing file, resulting in an error log.
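To illustrate the append mode described above, here is a minimal sketch of a destination that keeps writing to one existing file. The destination name d_hdfs_append is hypothetical, the other values come from examples elsewhere in this section, and writing the value as an unquoted true is an assumption based on the Type field.

```
# Sketch: append to an existing HDFS file; no UUID suffix is added in this mode.
# Any hdfs-archive-dir() setting would be ignored while append is enabled.
destination d_hdfs_append {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/hdfs/test.log")
        hdfs-append-enabled(true)
    );
};
```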
hdfs-archive-dir()
Type: string
Default: N/A
Description: The path where syslog-ng OSE will move the closed log files. If syslog-ng OSE cannot move the file for some reason (for example, syslog-ng OSE cannot connect to the HDFS NameNode), the file remains at its original location. For example, hdfs-archive-dir("/usr/hdfs/archive/").
NOTE: When hdfs-append-enabled is set to true, archiving is automatically disabled, and syslog-ng OSE will ignore the hdfs-archive-dir option.
hdfs-file()
Type: string
Default: N/A
Description: The path and name of the log file. For example, hdfs-file("/usr/hdfs/mylogfile.txt"). syslog-ng OSE checks if the path to the logfile exists. If a directory does not exist, syslog-ng OSE automatically creates it.
hdfs-file() supports the usage of macros. This means that syslog-ng OSE can create files on HDFS dynamically, using macros in the file (or directory) name.
NOTE: When a filename resolved from the macros contains a character that HDFS does not support, syslog-ng OSE will not be able to create the file. Make sure that you use macros that do not contain unsupported characters.
Example: Using macros in filenames
In the following example, a /var/testdb_working_dir/$DAY-$HOUR.txt file will be created (with a UUID suffix):
destination d_hdfs_9bf3ff45341643c69bf46bfff940372a {
    hdfs(
        client-lib-dir("/hdfs-libs")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/testdb_working_dir/$DAY-$HOUR.txt")
    );
};
As an example, if it is the 31st day of the month and it is 12 o'clock, then the name of the file will be 31-12.txt.
hdfs-max-filename-length()
Type: number
Default: 255
Description: The maximum length of the filename. This filename (including the UUID that syslog-ng OSE appends to it) cannot be longer than what the file system permits. If the filename is longer than the value of hdfs-max-filename-length, syslog-ng OSE will automatically truncate the filename. For example, hdfs-max-filename-length("255").
hdfs-resources()
Type: string
Default: N/A
Description: The list of Hadoop resources to load, separated by semicolons. For example, hdfs-resources("/home/user/hadoop/core-site.xml;/home/user/hadoop/hdfs-site.xml").
hdfs-uri()
Type: string
Default: N/A
Description: The URI of the HDFS NameNode, in hdfs://IPaddress:port or hdfs://hostname:port format. For example, hdfs-uri("hdfs://10.140.32.80:8020"). When using MapR-FS, the URI of the MapR-FS NameNode is in maprfs://IPaddress or maprfs://hostname format, for example, maprfs://10.140.32.80. The IP address of the node can be IPv4 or IPv6. An IPv6 address must be enclosed in square brackets ([]) as specified by RFC 2732, for example, hdfs-uri("hdfs://[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:8020").
hook-commands()
Description: This option makes it possible to execute external programs when the relevant driver is initialized or torn down. The hook-commands() can be used with all source and destination drivers with the exception of the usertty() and internal() drivers.
NOTE: The syslog-ng OSE application must be able to start and restart the external program, and have the necessary permissions to do so. For example, if your host is running AppArmor or SELinux, you might have to modify your AppArmor or SELinux configuration to enable syslog-ng OSE to execute external applications.
Using the hook-commands() when syslog-ng OSE starts or stops
To execute an external program when syslog-ng OSE starts or stops, use the following options:
startup()
Type: string
Default: N/A
Description: Defines the external program that is executed as syslog-ng OSE starts.
shutdown()
Type: string
Default: N/A
Description: Defines the external program that is executed as syslog-ng OSE stops.
Using the hook-commands() when syslog-ng OSE reloads
To execute an external program when the syslog-ng OSE configuration is initiated or torn down, for example, on startup/shutdown or during a syslog-ng OSE reload, use the following options:
setup()
Type: string
Default: N/A
Description: Defines an external program that is executed when the syslog-ng OSE configuration is initiated, for example, on startup or during a syslog-ng OSE reload.
teardown()
Type: string
Default: N/A
Description: Defines an external program that is executed when the syslog-ng OSE configuration is stopped or torn down, for example, on shutdown or during a syslog-ng OSE reload.
Example: Using the hook-commands() with a network source
In the following example, the hook-commands() is used with the network() driver and it opens an iptables port automatically as syslog-ng OSE is started/stopped.
The assumption in this example is that the LOGCHAIN chain is part of a larger ruleset that routes traffic to it. Whenever the syslog-ng OSE created rule is there, packets can flow, otherwise the port is closed.
source {
    network(
        transport(udp)
        hook-commands(
            startup("iptables -I LOGCHAIN 1 -p udp --dport 514 -j ACCEPT")
            shutdown("iptables -D LOGCHAIN 1")
        )
    );
};
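The example above uses startup() and shutdown(), which run only when syslog-ng OSE itself starts or stops. As a hedged sketch of the reload-aware variant, the same iptables commands can be given to setup() and teardown() instead, so that they also run when the configuration is torn down and initiated again during a reload. The source name s_udp_hooks is hypothetical, and the LOGCHAIN assumption from the example above still applies.

```
# Sketch: the rule is removed on teardown and re-inserted on setup,
# which includes every syslog-ng OSE reload, not only start and stop.
source s_udp_hooks {
    network(
        transport(udp)
        hook-commands(
            setup("iptables -I LOGCHAIN 1 -p udp --dport 514 -j ACCEPT")
            teardown("iptables -D LOGCHAIN 1")
        )
    );
};
```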
jvm-options()
Description: Specify the Java Virtual Machine (JVM) settings of your Java destination from the syslog-ng OSE configuration file.
For example:
jvm-options("-Xss1M -XX:+TraceClassLoading")
You can set this option only as a global option, by adding it to the options statement of the syslog-ng configuration file.
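As a minimal sketch of what setting it globally looks like, the JVM settings from the example above go into the options statement rather than into a destination:

```
# Global options statement; the JVM flags are the ones shown in the example above.
options {
    jvm-options("-Xss1M -XX:+TraceClassLoading");
};
```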
kerberos-keytab-file()
Type: string
Default: N/A
Description: The path to the Kerberos keytab file that you received from your Kerberos administrator. For example, kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab"). This option is needed only if you want to authenticate using Kerberos in Hadoop. You also have to set the kerberos-principal() option. For details on using Kerberos authentication with the hdfs() destination, see Kerberos authentication with syslog-ng hdfs() destination.
destination d_hdfs {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp-kerberos.syslog-ng.example:8020")
        kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab")
        kerberos-principal("hdfs-hdpkerberos@MYREALM")
        hdfs-file("/var/hdfs/test.log")
    );
};
Available in syslog-ng OSE version 3.10 and later.
kerberos-principal()
Type: string
Default: N/A
Description: The Kerberos principal you want to authenticate with. For example, kerberos-principal("hdfs-user@MYREALM"). This option is needed only if you want to authenticate using Kerberos in Hadoop. You also have to set the kerberos-keytab-file() option. For details on using Kerberos authentication with the hdfs() destination, see Kerberos authentication with syslog-ng hdfs() destination.
destination d_hdfs {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp-kerberos.syslog-ng.example:8020")
        kerberos-keytab-file("/opt/syslog-ng/etc/hdfs.headless.keytab")
        kerberos-principal("hdfs-hdpkerberos@MYREALM")
        hdfs-file("/var/hdfs/test.log")
    );
};
Available in syslog-ng OSE version 3.10 and later.
log-fifo-size()
Type: number
Default: Use global setting.
Description: The number of messages that the output queue can store.
on-error()
Accepted values: drop-message | drop-property | fallback-to-string | silently-drop-message | silently-drop-property | silently-fallback-to-string
Default: Use the global setting (which defaults to drop-message)
Description: Controls what happens when type-casting fails and syslog-ng OSE cannot convert some data to the specified type. By default, syslog-ng OSE drops the entire message and logs the error. Currently the value-pairs() option uses the settings of on-error().
- drop-message: Drop the entire message and log an error message to the internal() source. This is the default behavior of syslog-ng OSE.
- drop-property: Omit the affected property (macro, template, or message-field) from the log message and log an error message to the internal() source.
- fallback-to-string: Convert the property to string and log an error message to the internal() source.
- silently-drop-message: Drop the entire message silently, without logging the error.
- silently-drop-property: Omit the affected property (macro, template, or message-field) silently, without logging the error.
- silently-fallback-to-string: Convert the property to string silently, without logging the error.
retries()
Type: number (of attempts)
Default: 3
Description: The number of times syslog-ng OSE attempts to send a message to this destination. If syslog-ng OSE cannot send a message, it tries again until the number of attempts reaches the value of retries(), and then drops the message.
template()
Type: string
Default: A format conforming to the default logfile format.
Description: Specifies a template defining the logformat to be used in the destination. Macros are described in Macros of syslog-ng OSE. Please note that for network destinations it might not be appropriate to change the template as it changes the on-wire format of the syslog protocol which might not be tolerated by stock syslog receivers (like syslogd or syslog-ng itself). For network destinations make sure the receiver can cope with the custom format defined.
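As an illustration only, a template() for the hdfs destination might look like the following sketch. The destination name d_hdfs_template and the chosen layout are hypothetical; ISODATE, HOST, MSGHDR, and MSG are standard syslog-ng OSE macros, and the remaining values are taken from other examples in this section.

```
# Sketch: one message per line, ISO 8601 timestamp first (layout is an assumption).
destination d_hdfs_template {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/hdfs/test.log")
        template("${ISODATE} ${HOST} ${MSGHDR}${MSG}\n")
    );
};
```

Because hdfs writes to files rather than to a syslog receiver, the on-wire caveat for network destinations does not come into play here.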
throttle()
Description: Sets the maximum number of messages sent to the destination per second. Use this output-rate-limiting functionality only when using disk-buffer as well to avoid the risk of losing messages. Specifying 0 or a lower value sets the output limit to unlimited.
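The recommendation to pair throttle() with a disk-buffer can be sketched as follows. This is an untested illustration: the destination name d_hdfs_throttled and the limit of 1000 messages per second are hypothetical, and the disk-buffer() values are reused from the disk-buffer() examples in this section.

```
# Sketch: rate-limit output to 1000 messages per second (assumed value)
# and buffer the excess on disk instead of risking message loss.
destination d_hdfs_throttled {
    hdfs(
        client-lib-dir("/hdfs-libs/lib")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/hdfs/test.log")
        throttle(1000)
        disk-buffer(
            mem-buf-length(10000)
            disk-buf-size(2000000)
            reliable(no)
            dir("/tmp/disk-buffer")
        )
    );
};
```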
time-reap()
Accepted values: number (seconds)
Default: 0 (disabled)
Description: The time to wait in seconds before an idle destination file is closed. Note that if the hdfs-archive-dir() option is set and time-reap() expires, archiving is triggered for the affected file.
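The interaction between time-reap() and hdfs-archive-dir() can be sketched like this. The destination name d_hdfs_archive and the 60-second idle timeout are hypothetical; the file and archive paths are the ones used in the hdfs-file() and hdfs-archive-dir() descriptions.

```
# Sketch: a file that is idle for 60 seconds (assumed value) is closed
# and then moved to the archive directory.
destination d_hdfs_archive {
    hdfs(
        client-lib-dir("/hdfs-libs")
        hdfs-uri("hdfs://hdp2.syslog-ng.example:8020")
        hdfs-file("/var/testdb_working_dir/$DAY-$HOUR.txt")
        hdfs-archive-dir("/usr/hdfs/archive/")
        time-reap(60)
    );
};
```

This sketch leaves hdfs-append-enabled at its default of false, because archiving is disabled when appending is enabled.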
time-zone()
Type: name of the timezone, or the timezone offset
Default: unspecified
Description: Convert timestamps to the timezone specified by this option. If this option is not set, then the original timezone information in the message is used. Converting the timezone changes the values of all date-related macros derived from the timestamp, for example, HOUR. For the complete list of such macros, see Date-related macros.
The timezone can be specified by using the name, for example, time-zone("Europe/Budapest"), or as the timezone offset in +/-HH:MM format, for example, +01:00. On Linux and UNIX platforms, the valid timezone names are listed under the /usr/share/zoneinfo directory.
ts-format()
Type: rfc3164, bsd, rfc3339, iso
Default: rfc3164
Description: Override the global timestamp format (set in the global ts-format() parameter) for the specific destination. For details, see ts-format().
NOTE: This option applies only to file and file-like destinations. Destinations that use specific protocols (for example, network(), or syslog()) ignore this option. For protocol-like destinations, use a template locally in the destination, or use the proto-template option.
Version 3.7 of syslog-ng OSE can directly post log messages to web services using the HTTP protocol. Error and status messages received from the HTTP server are forwarded to the internal logs of syslog-ng OSE. The current implementation has the following limitations:
- This destination is only supported on the Linux platform.
- Only HTTP connections are supported; HTTPS is not.
- This destination requires Java. For an http destination that does not use Java, see http: Posting messages over HTTP without Java.
Declaration:
java(
    class-path("/syslog-ng/install_dir/lib/syslog-ng/java-modules/*.jar")
    class-name("org.syslog_ng.http.HTTPDestination")
    option("url", "http://<server-address>:<port-number>")
);
Example: Sending log data to a web service
The following example defines an http destination.
destination d_http {
    java(
        class-path("/syslog-ng/install_dir/lib/syslog-ng/java-modules/*.jar")
        class-name("org.syslog_ng.http.HTTPDestination")
        option("url", "http://192.168.1.1:80")
    );
};
log { source(s_file); destination(d_http); flags(flow-control); };
NOTE: If you delete all Java destinations from your configuration and reload syslog-ng OSE, the JVM is no longer used, but it is still running. If you want to stop the JVM, stop syslog-ng OSE and then start it again.
The http destination of syslog-ng OSE can directly post log messages to web services using the HTTP protocol. The http destination has the following options. Some of these options are directly used by the Java code underlying the http destination, therefore these options must be specified in the following format:
option("<option-name>", "<option-value>")
For example, option("url", "http://<server-address>:<port-number>"). The exact format to use is indicated in the description of the option.
Required options
The following options are required: url().
ca-dir()
Accepted values: Directory name
Default: none
Description: The name of a directory that contains a set of trusted CA certificates in PEM format. The CA certificate files have to be named after the 32-bit hash of the subject's name. This naming can be created using the c_rehash utility in openssl. For an example, see Configuring TLS on the syslog-ng clients. The syslog-ng OSE application uses the CA certificates in this directory to validate the certificate of the peer.
This option can be used together with the optional ca-file() option.
ca-file()
Accepted values: File name
Default: empty
Description: Optional. The name of a file that contains a set of trusted CA certificates in PEM format. The syslog-ng OSE application uses the CA certificates in this file to validate the certificate of the peer.
Example format in configuration:
ca-file("/etc/pki/tls/certs/ca-bundle.crt")
NOTE: The ca-file() option can be used together with the ca-dir() option, and it is relevant when peer-verify() is set to a value other than no or optional-untrusted.
class-name()
Type: string
Default: N/A
Description: The name of the class (including the name of the package) that includes the destination driver to use.
For the http destination, use this option as class-name("org.syslog_ng.http.HTTPDestination").
class-path()
Type: string
Default: The syslog-ng OSE module directory: /opt/syslog-ng/lib/syslog-ng/java-modules/
Description: The list of paths where the required Java classes are located. For example, class-path("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/my-java-libraries/libs/"). If you set this option multiple times in your syslog-ng OSE configuration (for example, because you have multiple Java-based destinations), syslog-ng OSE merges all available paths into a single list.
For the http destination, include the path to the java modules of syslog-ng OSE, for example, class-path("/syslog-ng/install_dir/lib/syslog-ng/java-modules/*.jar").
hook-commands()
Description: This option makes it possible to execute external programs when the relevant driver is initialized or torn down. The hook-commands() can be used with all source and destination drivers with the exception of the usertty() and internal() drivers.
NOTE: The syslog-ng OSE application must be able to start and restart the external program, and have the necessary permissions to do so. For example, if your host is running AppArmor or SELinux, you might have to modify your AppArmor or SELinux configuration to enable syslog-ng OSE to execute external applications.
Using the hook-commands() when syslog-ng OSE starts or stops
To execute an external program when syslog-ng OSE starts or stops, use the following options:
startup()
Type: string
Default: N/A
Description: Defines the external program that is executed as syslog-ng OSE starts.
shutdown()
Type: string
Default: N/A
Description: Defines the external program that is executed as syslog-ng OSE stops.
Using the hook-commands() when syslog-ng OSE reloads
To execute an external program when the syslog-ng OSE configuration is initiated or torn down, for example, on startup/shutdown or during a syslog-ng OSE reload, use the following options:
setup()
Type: string
Default: N/A
Description: Defines an external program that is executed when the syslog-ng OSE configuration is initiated, for example, on startup or during a syslog-ng OSE reload.
teardown()
Type: string
Default: N/A
Description: Defines an external program that is executed when the syslog-ng OSE configuration is stopped or torn down, for example, on shutdown or during a syslog-ng OSE reload.
Example: Using the hook-commands() with a network source
In the following example, the hook-commands() is used with the network() driver and it opens an iptables port automatically as syslog-ng OSE is started/stopped.
The assumption in this example is that the LOGCHAIN chain is part of a larger ruleset that routes traffic to it. Whenever the syslog-ng OSE created rule is there, packets can flow, otherwise the port is closed.
source {
    network(
        transport(udp)
        hook-commands(
            startup("iptables -I LOGCHAIN 1 -p udp --dport 514 -j ACCEPT")
            shutdown("iptables -D LOGCHAIN 1")
        )
    );
};
jvm-options()
Description: Specify the Java Virtual Machine (JVM) settings of your Java destination from the syslog-ng OSE configuration file.
For example:
jvm-options("-Xss1M -XX:+TraceClassLoading")
You can set this option only as a global option, by adding it to the options statement of the syslog-ng configuration file.
log-fifo-size()
Type: number
Default: Use global setting.
Description: The number of messages that the output queue can store.
method()
Type: DELETE | HEAD | GET | OPTIONS | POST | PUT | TRACE
Default: PUT
Description: Specifies the HTTP method to use when sending the message to the server. Available in syslog-ng OSE version 3.7.2 and newer.
retries()
Type: number (of attempts)
Default: 3
Description: The number of times syslog-ng OSE attempts to send a message to this destination. If syslog-ng OSE cannot send a message, it tries again until the number of attempts reaches the value of retries(), and then drops the message.
template()
Type: string
Default: A format conforming to the default logfile format.
Description: Specifies a template defining the logformat to be used in the destination. Macros are described in Macros of syslog-ng OSE. Please note that for network destinations it might not be appropriate to change the template as it changes the on-wire format of the syslog protocol which might not be tolerated by stock syslog receivers (like syslogd or syslog-ng itself). For network destinations make sure the receiver can cope with the custom format defined.
throttle()
Description: Sets the maximum number of messages sent to the destination per second. Use this output-rate-limiting functionality only when using disk-buffer as well to avoid the risk of losing messages. Specifying 0 or a lower value sets the output limit to unlimited.
url()
Description: Specifies the hostname or IP address and optionally the port number of the web service that can receive log data via HTTP. Use a colon (:) after the address to specify the port number of the server. You can also use macros, templates, and template functions in the URL, for example: http://host.example.com:8080/${MACRO1}/${MACRO2}/script
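As a sketch of a macro-based URL, the declaration format shown earlier for the http destination can be combined with real macros in place of the MACRO1 and MACRO2 placeholders. The destination name d_http_macros and the choice of the HOST and PROGRAM macros are hypothetical, and the host name is the placeholder used in the description above.

```
# Sketch: the URL path is built per message from the HOST and PROGRAM macros.
destination d_http_macros {
    java(
        class-path("/syslog-ng/install_dir/lib/syslog-ng/java-modules/*.jar")
        class-name("org.syslog_ng.http.HTTPDestination")
        option("url", "http://host.example.com:8080/${HOST}/${PROGRAM}/script")
    );
};
```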