Starting with version
To use this destination, syslog-ng Premium Edition must run in server mode. Typically, only the central syslog-ng Premium Edition server uses this destination. For details on the server mode, see Server mode.
Note the following limitations when using the syslog-ng PE hdfs destination:
This destination is only supported on the Linux platforms that use the linux glibc2.11 installer, including: Red Hat ES 7, Ubuntu 14.04 (Trusty Tahr).
Since syslog-ng PE uses the official Java HDFS client, the hdfs destination has significant memory usage (about 400MB).
You cannot set when log messages are flushed. Hadoop performs this action automatically, depending on its configured block size, and the amount of data received. There is no way for the syslog-ng PE application to influence when the messages are actually written to disk. This means that syslog-ng PE cannot guarantee that a message sent to HDFS is actually written to disk. When using flow-control, syslog-ng PE acknowledges a message as written to disk when it passes the message to the HDFS client. This method is as reliable as your HDFS environment.
The log messages of the underlying client libraries are available in the internal() source of syslog-ng PE.
The hdfs destination has been tested with Hortonworks Data Platform.
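As a sketch of how the client library messages mentioned in the limitations above could be collected, the following configuration routes the internal() source to a local file. The names s_internal and d_internal and the file path are assumptions for illustration only. If your configuration already defines a source that contains internal(), reuse that source instead.

```
# Hypothetical source, destination, and file names, for illustration only.
source s_internal {
    internal();
};

destination d_internal {
    file("/var/log/syslog-ng-internal.log");
};

# Messages of the HDFS client libraries arrive through internal().
log {
    source(s_internal);
    destination(d_internal);
};
```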
@module mod-java
@include "scl.conf"

hdfs(
    client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:<path-to-preinstalled-hadoop-libraries>")
    hdfs-uri("hdfs://NameNode:8020")
    hdfs-file("<path-to-logfile>")
);
The following example defines an hdfs destination using only the required parameters.
@module mod-java
@include "scl.conf"

destination d_hdfs {
    hdfs(
        client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/hadoop/libs")
        hdfs-uri("hdfs://10.140.32.80:8020")
        hdfs-file("/user/log/logfile.txt")
    );
};
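A destination has no effect until a log path references it. The following is a minimal sketch of such a log path, assuming the d_hdfs destination from the example above and a hypothetical network source named s_net. The source name, transport, and port are assumptions, so adjust them to your environment. The flags(flow-control) setting enables flow-control for the path. Note that, as described in the limitations, a message is acknowledged when it is passed to the HDFS client, not when HDFS writes it to disk.

```
# s_net and its port are hypothetical; use your own source definition.
source s_net {
    network(transport("tcp") port(601));
};

log {
    source(s_net);
    destination(d_hdfs);
    # Acknowledged on handover to the HDFS client, not on disk write.
    flags(flow-control);
};
```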
To install the software required for the hdfs destination, see Prerequisites.
For details on how the hdfs destination works, see How syslog-ng PE interacts with HDFS.
For details on using MapR-FS, see Storing messages with MapR-FS.
For the list of options, see HDFS destination options.
If you delete all Java destinations from your configuration and reload syslog-ng, the JVM is no longer used, but it keeps running. To stop the JVM, stop syslog-ng and then start it again.
The following describes how to send messages from syslog-ng PE to HDFS.
To send messages from syslog-ng PE to HDFS
If you want to use the Java-based modules of syslog-ng PE (for example, the Elasticsearch, HDFS, or Kafka destinations), download and install the Java Runtime Environment (JRE) version 1.7 or newer.
Download the Hadoop Distributed File System (HDFS) libraries (version 2.x) from http://hadoop.apache.org/releases.html.
Extract the HDFS libraries into a target directory (for example, /opt/hadoop/lib/), then execute the classpath command of the hadoop script: bin/hdfs classpath
Use the classpath that this command returns in the syslog-ng PE configuration file, in the client-lib-dir() option of the HDFS destination.
The syslog-ng PE application sends the log messages to the official HDFS client library, which forwards the data to the HDFS nodes. The following steps describe how syslog-ng PE interacts with HDFS.
After syslog-ng PE is started and the first message arrives at the hdfs destination, the hdfs destination tries to connect to the HDFS NameNode. If the connection fails, syslog-ng PE repeatedly attempts to connect again each time the period set in time-reopen() expires.
syslog-ng PE checks if the path to the logfile exists. If a directory does not exist, syslog-ng PE automatically creates it. syslog-ng PE creates the destination file (using the filename set in the syslog-ng PE configuration file, with a UUID suffix to make it unique, for example, /usr/hadoop/logfile.txt.3dc1c59e-ab3b-4b71-9e81-93db477ed9d9) and writes the message into the file. After the file is created, syslog-ng PE writes all incoming messages of the hdfs destination into this file.
When the hdfs-append-enabled() option is set to true, syslog-ng PE does not assign a new UUID suffix to an existing file, because in that case it is possible to open a closed file and append data to it.
You cannot set when log messages are flushed. Hadoop performs this action automatically, depending on its configured block size, and the amount of data received. There is no way for the syslog-ng PE application to influence when the messages are actually written to disk. This means that syslog-ng PE cannot guarantee that a message sent to HDFS is actually written to disk. When using flow-control, syslog-ng PE acknowledges a message as written to disk when it passes the message to the HDFS client. This method is as reliable as your HDFS environment.
If the HDFS client returns an error, syslog-ng PE attempts to close the file, then opens a new file and tries to send the message again (reconnecting to HDFS and resending the message), up to the number of attempts set in the retries() parameter. If sending the message fails retries() times, syslog-ng PE drops the message.
The syslog-ng PE application closes the destination file in the following cases:
syslog-ng PE is reloaded.
syslog-ng PE is restarted.
The HDFS client returns an error.
If the file is closed and you have set an archive directory, syslog-ng PE moves the file to this directory. If syslog-ng PE cannot move the file for some reason (for example, syslog-ng PE cannot connect to the HDFS NameNode), the file remains at its original location, and syslog-ng PE does not try to move it again.
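The reconnection, retry, archiving, and append behavior described in the steps above is controlled by options of the hdfs destination. The following sketch uses time-reopen(), retries(), and hdfs-append-enabled() as they are named in the steps, and assumes that the archive directory is set with an option called hdfs-archive-dir(). The destination names, values, and archive path are illustrative assumptions, so check HDFS destination options for the exact names and defaults in your version. Archiving and appending are shown in separate destinations here; whether they can be combined is not covered by this section.

```
@module mod-java
@include "scl.conf"

# Hypothetical destination: reconnect, retry, and archive settings.
destination d_hdfs_archive {
    hdfs(
        client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/hadoop/libs")
        hdfs-uri("hdfs://10.140.32.80:8020")
        hdfs-file("/user/log/logfile.txt")
        # Assumed option name; closed files are moved to this directory.
        hdfs-archive-dir("/user/log/archive")
        # Seconds to wait before trying to reconnect to the NameNode.
        time-reopen(60)
        # Number of send attempts before the message is dropped.
        retries(3)
    );
};

# Hypothetical destination: reopen and append to an existing file
# instead of creating a new file with a UUID suffix.
destination d_hdfs_append {
    hdfs(
        client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/hadoop/libs")
        hdfs-uri("hdfs://10.140.32.80:8020")
        hdfs-file("/user/log/logfile-append.txt")
        hdfs-append-enabled(true)
    );
};
```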
The syslog-ng PE application is also compatible with MapR File System (MapR-FS). Starting from version 5.4, syslog-ng Premium Edition is MapR certified. MapR-FS provides better performance, reliability, efficiency, maintainability, and ease of use compared to the default Hadoop Distributed File System (HDFS). To use MapR-FS with syslog-ng PE, complete the following steps:
Install MapR libraries. MapR uses its own libraries instead of the official Apache HDFS libraries. The supported version is MapR 4.x.
Download the libraries from the Maven Repository and Artifacts for MapR, or get them from an already existing MapR installation.
Install MapR. If you do not know how to install MapR, follow the instructions on the MapR website.
In a default MapR installation, the required libraries are installed in the following path: /opt/mapr/lib.
Enter the path where MapR was installed in the class-path option of the hdfs destination, for example:
class-path("/opt/mapr/lib/")
If the libraries were downloaded from the Maven Repository, the following additional libraries are required. Note that the version numbers in the filenames can be different in the various Hadoop releases: commons-collections-3.2.1.jar, commons-logging-1.1.3.jar, hadoop-auth-2.5.1.jar, log4j-1.2.15.jar, slf4j-api-1.7.5.jar, commons-configuration-1.6.jar, guava-13.0.1.jar, hadoop-common-2.5.1.jar, maprfs-4.0.2-mapr.jar, slf4j-log4j12-1.7.5.jar, commons-lang-2.5.jar, hadoop-0.20.2-dev-core.jar, json-20080701.jar, protobuf-java-2.5.0.jar, zookeeper-3.4.5-mapr-1406.jar.
Configure the hdfs destination in syslog-ng PE.
The following example defines an hdfs destination for MapR-FS using only the required parameters.
@module mod-java
@include "scl.conf"

destination d_mapr {
    hdfs(
        client-lib-dir("/opt/syslog-ng/lib/syslog-ng/java-modules/:/opt/mapr/lib/")
        hdfs-uri("maprfs://10.140.32.80")
        hdfs-file("/user/log/logfile.txt")
    );
};
© 2022 One Identity LLC. ALL RIGHTS RESERVED.