syslog-ng Store Box 6.10.0 - Administration Guide

Monitoring CPU load averages

SNMP object: UCD-SNMP-MIB::laLoad

Community (v2c) / Context (v3): Data and system

The CPU load averages (or system load averages) are the average load of the syslog-ng Store Box (SSB) CPUs and the size of the task queue during the past 1, 5, and 15 minutes, respectively.

If the load is constantly equal to or higher than the number of CPUs in your appliance, fine-tune your configuration or purchase more SSB appliances. For assistance, contact our Support Team.
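The rule above can be sketched as a small check. This is a minimal illustration, not part of SSB: it assumes that the load averages have already been retrieved from UCD-SNMP-MIB::laLoad and parsed into floats, and the function name and sample values are hypothetical.

```python
from typing import Sequence


def load_constantly_high(load_samples: Sequence[float], cpu_count: int) -> bool:
    """Return True if every sampled load average is at or above the CPU count."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    if not load_samples:
        return False  # no samples, so no conclusion can be drawn
    return all(sample >= cpu_count for sample in load_samples)


# Hypothetical 15-minute load averages collected on a 4-CPU appliance.
samples = [4.2, 4.8, 5.1, 4.0]
print(load_constantly_high(samples, cpu_count=4))
```

A single high sample is not treated as a problem here; only a series that stays at or above the CPU count is flagged, which mirrors the word "constantly" in the guidance.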

If you query UCD-SNMP-MIB::laTable, the returned table contains the same values you would get when querying UCD-SNMP-MIB::laLoad, but in a more structured, easy-to-read format.

For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value is continuously changing.
Related issues and issue indicators: Load too high (see Monitoring SSB's CPU).

Solution:

  • Decrease load.
  • Purchase a new SSB appliance.

Monitoring CPU usage

User CPU time:
SNMP object: UCD-SNMP-MIB::ssCpuUser

Community (v2c) / Context (v3): Data and system

If the processor is not idle (for example, there is live log traffic or report generation), it is normal that the majority of the CPU time is spent on running user space processes.

For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value is continuously changing.
Related issues and issue indicators: If the returned value is too high, it indicates that the CPU is highly loaded by user space processes.

Solution:

  • Decrease load.
  • Purchase a new syslog-ng Store Box (SSB) appliance.
  • Reconsider your configuration settings.

System CPU time:
SNMP object: UCD-SNMP-MIB::ssCpuSystem

Community (v2c) / Context (v3): Data and system

The amount of time spent in the kernel should be as low as possible. Ideally, around 0.5% of the time given to the different processes is spent in the kernel. This number can peak much higher, especially when there are a lot of I/O activities.

For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value is continuously changing.
Related issues and issue indicators: If the returned value is too high, it indicates that kernel-intensive operations are running on the CPU.

Solution:

  • Decrease load.
  • Purchase a new SSB appliance.
  • Reconsider your configuration settings.

Idle CPU time:
SNMP object: UCD-SNMP-MIB::ssCpuIdle

Community (v2c) / Context (v3): Data and system

The total of the user CPU time percentage and the idle CPU percentage should be close to 100%. If the CPU spends a lot more time in other states, it is worth investigating the root cause, because it can indicate issues.

For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value is continuously changing.

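The rule that the user and idle CPU percentages should add up to roughly 100% can be sketched as follows. This is a minimal illustration that assumes the values of UCD-SNMP-MIB::ssCpuUser and UCD-SNMP-MIB::ssCpuIdle have already been retrieved as integer percentages. The function names and the 10% tolerance are hypothetical and are not values defined by SSB.

```python
def cpu_time_in_other_states(user_pct: int, idle_pct: int) -> int:
    """Percentage of CPU time that is neither user nor idle time."""
    # Rounded percentages can add up to slightly more than 100, so clamp at 0.
    return max(0, 100 - (user_pct + idle_pct))


def needs_investigation(user_pct: int, idle_pct: int, tolerance_pct: int = 10) -> bool:
    """Return True if more than tolerance_pct of CPU time is spent in other states."""
    return cpu_time_in_other_states(user_pct, idle_pct) > tolerance_pct


print(cpu_time_in_other_states(60, 35))  # 5
print(needs_investigation(40, 30))       # True: 30% is spent in other states
```

Choose the tolerance to suit your own environment; the sketch only shows the shape of the comparison.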
Monitoring the details of your CPU usage:

To fine-tune monitoring your CPU usage, you can use the following values. It is also possible to generate a chart from these values.

Raw CPU user time: UCD-SNMP-MIB::ssCpuRawUser
Raw CPU nice time: UCD-SNMP-MIB::ssCpuRawNice
Raw CPU system time: UCD-SNMP-MIB::ssCpuRawSystem
Raw CPU idle time: UCD-SNMP-MIB::ssCpuRawIdle
Raw CPU wait time: UCD-SNMP-MIB::ssCpuRawWait
Raw CPU kernel time: UCD-SNMP-MIB::ssCpuRawKernel
Raw CPU interrupt time: UCD-SNMP-MIB::ssCpuRawInterrupt
Raw CPU Soft IRQ time: UCD-SNMP-MIB::ssCpuRawSoftIRQ
Raw CPU steal time: UCD-SNMP-MIB::ssCpuRawSteal
Raw CPU guest time: UCD-SNMP-MIB::ssCpuRawGuest
Raw CPU guest nice time: UCD-SNMP-MIB::ssCpuRawGuestNice

Community (v2c) / Context (v3): Data and system for all of the above

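The raw values above are cumulative tick counters, so a chart is usually built from the difference between two polls rather than from the raw numbers. The following is a minimal sketch of that calculation, assuming that two snapshots of the counters have already been collected and that the counters passed in do not overlap (some agents report combined values in ssCpuRawSystem, so check before adding it to the kernel or wait counters). The function name, dictionary keys, and sample values are hypothetical.

```python
from typing import Dict

COUNTER32_MODULUS = 2 ** 32  # the raw counters are 32-bit and wrap around


def cpu_shares(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, float]:
    """Return the percentage share of each counter between two snapshots."""
    deltas = {
        name: (current[name] - previous[name]) % COUNTER32_MODULUS
        for name in previous
    }
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Hypothetical snapshots taken one polling interval apart.
before = {"user": 1000, "system": 200, "idle": 8000, "wait": 50}
after = {"user": 1600, "system": 300, "idle": 8250, "wait": 100}
print(cpu_shares(before, after))
```

The modulo operation keeps the difference correct when a counter wraps around between two polls.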

Monitoring SSB's I/O

Disk I/O per partition
SNMP object: UCD-DISKIO-MIB::diskIOTable

Community (v2c) / Context (v3): Data and system
  • sda

    If the 15-minute load (for details, see Monitoring CPU load averages) is getting close to 90%, your system does not have enough resources and you probably need to purchase more syslog-ng Store Box (SSB) appliances. For assistance, contact our Support Team.

  • sdb

    NOTE: This is only available on SSB T1 appliances.

    If the 15-minute load (for details, see Monitoring CPU load averages) is getting close to 90%, your system does not have enough resources and you probably need to purchase more SSB appliances. For assistance, contact our Support Team.

For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value changes quite often, depending on the I/O load and its type.
Related issues and issue indicators: When I/O load is too high, the system slows down.

Solution:

  • Reconsider your configuration settings.
  • Purchase a new SSB appliance.
  • For technical assistance, contact our Support Team.

Interfaces I/O by interface name
SNMP object: RFC1213-MIB::ifTable

Community (v2c) / Context (v3): Data and system

The following interfaces can be monitored (the type of traffic that can affect the load):

  • eth0 - external (network, redundant HA, next hop monitoring)

  • eth1 - management (network, redundant HA, next hop monitoring)

  • eth2 - internal (redundant, next hop monitoring)

  • eth3 - HA

If the load on an interface seems to be too high, check whether you have configured SSB in a way that affects that interface. For example, if you do not use a management interface, the load on the external interface can be higher. Configuring next hop monitoring can also increase the load on an interface.
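To judge whether the load on an interface is high, it helps to turn the cumulative octet counters of ifTable (such as ifInOctets and ifOutOctets) into a rate. The following is a minimal sketch, assuming that two counter readings and the time between them are already available. The function name and the sample numbers are hypothetical.

```python
COUNTER32_MODULUS = 2 ** 32  # ifInOctets and ifOutOctets are 32-bit counters


def throughput_bps(previous_octets: int, current_octets: int, interval_seconds: float) -> float:
    """Average throughput in bits per second between two counter readings."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # The modulo keeps the difference correct if the counter wrapped once.
    delta_octets = (current_octets - previous_octets) % COUNTER32_MODULUS
    return delta_octets * 8 / interval_seconds


# Hypothetical ifInOctets readings taken 30 seconds apart.
print(throughput_bps(1_000_000, 4_750_000, 30))  # 1000000.0
```

On fast links a 32-bit counter can wrap more than once between polls, so keep the polling interval short enough for your traffic volume.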

RFC1213-MIB::ifTable
For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value is continuously changing, depending on incoming logs and DRBD sync.
Related issues and issue indicators: I/O load may become too high on network interfaces, which may result in log loss, slow sync, and HA in degraded mode.

Solution:

  • Reconsider your configuration settings.
  • Purchase a new SSB appliance.
  • For technical assistance, contact our Support Team.

RFC1213-MIB::ifTable - ETH0, ETH3
For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value is continuously changing, depending on the number of incoming logs.
Related issues and issue indicators: If the I/O load is too high for the network to handle, it may result in log loss.

Caution:

Hazard of data loss: If the I/O load becomes too high for the network to handle, it may cause log loss. To avoid log loss, reconsider your configuration settings. Alternatively, consider upgrading the capacity of your SSB appliance, purchasing more SSB appliances, or contacting our Support Team.

Solution:

RFC1213-MIB::ifTable - ETH3
For which systems and configurations is it applicable? Only applicable for HA clusters.
Value change frequency: Its value is continuously changing, depending on the number of incoming logs.
Related issues and issue indicators: The network traffic load is too high for the NIC to handle. (This rarely happens.)

Solution:

Monitoring SSB statistics

SSB's version number
SNMP object: SSB-SNMP-MIB::ssbFirmwareVersion
Type: String
Community (v2c) / Context (v3): Data
Short description: Current version of the syslog-ng Store Box (SSB).

Description: The current version number of SSB. This always changes after a successful upgrade.

Number of session files on SSB
SNMP object: SSB-SNMP-MIB::ssbHTTPSessions
Type: Integer32

Community (v2c) / Context (v3): Data
Short description: Number of recently active HTTP-based connections to SSB.

Description: The number of session files on SSB. These are generated as a result of the following events:

  • Accessing the web user interface of SSB.
  • Accessing a remote logspace.
  • Performing an RPC API call.

For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value is continuously changing, depending on the number of active connections.
Related issues and issue indicators: If the returned value changes too often within a short period of time, it can indicate a brute force attack.

Solution:

  • Cooperate with your network administrator to fend off the external brute force attack.
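
One way to turn "changes too often within a short period of time" into a check is to count how many consecutive polls returned a different value. The following is a minimal sketch, assuming that SSB-SNMP-MIB::ssbHTTPSessions is polled at a fixed interval and that the results are kept in a list. The function names, the threshold of 4 changes, and the sample values are hypothetical and are not values defined by SSB.

```python
from typing import Sequence


def change_count(session_counts: Sequence[int]) -> int:
    """Number of consecutive polls in which the session count changed."""
    return sum(1 for a, b in zip(session_counts, session_counts[1:]) if a != b)


def changes_too_often(session_counts: Sequence[int], max_changes: int = 4) -> bool:
    """Return True if the value changed more than max_changes times in the window."""
    return change_count(session_counts) > max_changes


# Hypothetical values from eight consecutive polls.
window = [3, 3, 4, 9, 15, 22, 22, 30]
print(change_count(window))       # 5
print(changes_too_often(window))  # True
```

A busy but legitimate system also changes this value frequently, so the threshold has to be derived from your own baseline.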

Number of core files on SSB
SNMP object: SSB-SNMP-MIB::ssbCoreFiles
Type: Integer (number of)

Community (v2c) / Context (v3): Data
Short description: The number of core files in SSB's core firmware.

Description: If the value of this parameter is larger than 0, contact our Support Team.

For which systems and configurations is it applicable? Applicable for all configurations and systems, but unless a core file is generated, its returned value is 0.
Value change frequency: Its value does not change often, only when a core file is generated.
Related issues and issue indicators: Even a single core file indicates an issue. More than one core file indicates a more serious issue.

Solution:

  • Check the state of syslog-ng/indexer.
  • Restart your syslog-ng application.
  • For technical assistance, contact our Support Team.

Available free space on SSB
SNMP object: SSB-SNMP-MIB::ssbUnusedLogStorageCapacity
Type: Integer (% percent)

Community (v2c) / Context (v3): Data
Short description: Ratio of free space on SSB compared to the Disk space fill up prevention limit.

Description: The available free space on SSB.

Caution:

Hazard of data loss: If the value of this parameter is constantly close to 0%, fine-tune your configuration or purchase more SSB appliances. For assistance, contact our Support Team.

If the value of this parameter reaches 0%, SSB will stop receiving logs.

If you have an Archive Policy configured, archiving will start after the value of this parameter reaches 0%. Therefore, SSB might start receiving logs again after some time has passed.

Make sure that you always have enough free space.

The definition of "enough" varies based on your specific configuration settings, for example:

  • The disk size of your SSB appliance.
  • The size, number and frequency of your incoming logs.
  • Your Policies > Backup & Archive/Cleanup settings configuration.
  • Your Basic Settings > Management > Disk space fill up prevention limit configuration. For details, see Preventing disk space fill up.

Example: Available free space on SSB

To calculate the available free space on SSB, the following formula is used:

[Disk capacity of the core partition] - [The free space above the Basic Settings > Management > Disk space fill up prevention limit] - [The space that is already in use].

For example:

  • Disk capacity of the core partition: This is always 100%
  • The free space above the Basic Settings > Management > Disk space fill up prevention limit: If SSB is configured to Disconnect clients when disks are 90 percent used, this value is 100% - 90% = 10%
  • The space that is already in use: 35%

Available free space on SSB = 100% - 10% - 35% = 55%
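
The calculation in this example can also be written as a short function. This is a minimal sketch of the formula above, not SSB's actual implementation. The function and parameter names are hypothetical.

```python
def unused_log_storage_pct(fill_up_limit_pct: int, used_pct: int) -> int:
    """Available free space as a percentage, following the formula above."""
    disk_capacity_pct = 100  # the core partition is always taken as 100%
    # Space above the Disk space fill up prevention limit, for example 100 - 90 = 10.
    reserved_pct = disk_capacity_pct - fill_up_limit_pct
    return disk_capacity_pct - reserved_pct - used_pct


# Clients are disconnected at 90 percent disk usage, and 35 percent is in use.
print(unused_log_storage_pct(fill_up_limit_pct=90, used_pct=35))  # 55
```

With a limit of 90% and 90% of the disk in use, the function returns 0, which corresponds to the point where SSB stops receiving logs.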

For which systems and configurations is it applicable? Applicable for all configurations and systems.
Value change frequency: Its value is continuously decreasing, depending on available log storage capacity.
Related issues and issue indicators: As the returned value approaches 0, the available log storage capacity is continuously decreasing.

Solution:
