SNMP object: |
UCD-SNMP-MIB::laLoad |
Community (v2c) /
Context (v3) |
Data and system |
CPU load averages (or system load averages) show the average load on the CPUs of syslog-ng Store Box (SSB) and the size of the task queue during the past 1, 5, and 15 minutes, respectively.
If the load is constantly equal to or higher than the number of CPUs in your appliance, fine-tune your configuration or purchase more SSB appliances. For assistance, contact our Support Team.
If you query UCD-SNMP-MIB::laTable, the returned table contains the same values that you would get by querying UCD-SNMP-MIB::laLoad, but in a more structured, easy-to-read format.
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value is continuously changing. |
Related issues and issue indicators |
Load too high (see Monitoring SSB's CPU). |
Solution:
- Decrease load.
- Purchase a new SSB appliance.
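The rule of thumb above (load constantly equal to or higher than the number of CPUs) can be sketched as a small check. This is only an illustration: the function name, the sample values, and the CPU count are assumptions, and the load values are expected to have been collected already from UCD-SNMP-MIB::laLoad by your monitoring tool.

```python
from typing import Sequence


def load_constantly_high(samples: Sequence[float], cpu_count: int) -> bool:
    """Return True when every collected load sample is at or above the CPU count."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    if not samples:
        # No data collected yet, so nothing can be concluded.
        return False
    return all(sample >= cpu_count for sample in samples)


# Hypothetical 15-minute load averages from three consecutive polls of a 4-CPU appliance.
print(load_constantly_high([4.2, 4.0, 5.1], cpu_count=4))  # prints True
```

A single high sample is not a reason for concern by itself; the check only fires when all samples in the observed window reach the CPU count.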
User CPU time:
SNMP object: |
UCD-SNMP-MIB::ssCpuUser |
Community (v2c) /
Context (v3) |
Data and system |
If the processor is not idle (for example, there is live log traffic or report generation), it is normal that the majority of the CPU time is spent running user space processes.
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value is continuously changing. |
Related issues and issue indicators |
If the returned value is too high, it indicates that user space processes are placing a heavy load on the CPU. |
Solution:
System CPU time:
SNMP object: |
UCD-SNMP-MIB::ssCpuSystem |
Community (v2c) /
Context (v3) |
Data and system |
The amount of time spent in the kernel should be as low as possible. Ideally, around 0.5% of the time given to the different processes is spent in the kernel. This number can peak much higher, especially when there is a lot of I/O activity.
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value is continuously changing. |
Related issues and issue indicators |
If the returned value is too high, it indicates that kernel-intensive operations are running on the CPU. |
Solution:
Idle CPU time:
SNMP object: |
UCD-SNMP-MIB::ssCpuIdle |
Community (v2c) /
Context (v3) |
Data and system |
The total of the user CPU time percentage and the idle CPU percentage should be close to 100%. If the CPU spends a lot more time in other states, it is worth investigating the root cause, because it can indicate issues.
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value is continuously changing. |
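As a minimal sketch of the check described above, the following computes how much CPU time is spent outside the user and idle states. The function names and the 10% tolerance are assumptions made for illustration only; choose a tolerance that matches your environment.

```python
def other_states_share(user_pct: int, idle_pct: int) -> int:
    """Percentage of CPU time that is neither user time nor idle time."""
    return 100 - (user_pct + idle_pct)


def needs_investigation(user_pct: int, idle_pct: int, tolerance_pct: int = 10) -> bool:
    """Return True when user + idle falls short of 100% by more than the tolerance.

    tolerance_pct is a hypothetical threshold, not a value defined by SSB.
    """
    return other_states_share(user_pct, idle_pct) > tolerance_pct


# Hypothetical values returned for ssCpuUser and ssCpuIdle.
print(other_states_share(60, 38))   # prints 2
print(needs_investigation(40, 30))  # prints True
```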
Monitoring the details of your CPU usage:
To fine-tune the monitoring of your CPU usage, you can use the following values. It is also possible to generate a chart from these values.
Raw CPU user time |
UCD-SNMP-MIB::ssCpuRawUser |
Raw CPU nice time |
UCD-SNMP-MIB::ssCpuRawNice |
Raw CPU system time |
UCD-SNMP-MIB::ssCpuRawSystem |
Raw CPU idle time |
UCD-SNMP-MIB::ssCpuRawIdle |
Raw CPU wait time |
UCD-SNMP-MIB::ssCpuRawWait |
Raw CPU kernel time |
UCD-SNMP-MIB::ssCpuRawKernel |
Raw CPU interrupt time |
UCD-SNMP-MIB::ssCpuRawInterrupt |
Raw CPU Soft IRQ time |
UCD-SNMP-MIB::ssCpuRawSoftIRQ |
Raw CPU steal time |
UCD-SNMP-MIB::ssCpuRawSteal |
Raw CPU guest time |
UCD-SNMP-MIB::ssCpuRawGuest |
Raw CPU guest nice time |
UCD-SNMP-MIB::ssCpuRawGuestNice |
Community (v2c) /
Context (v3) |
Data and system for all of the above |
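The raw values are cumulative tick counters, so a chart is usually built from the difference between two polls rather than from the values themselves. The sketch below shows one way to turn two samples into percentage shares. It assumes 32-bit counters that can wrap around, and the choice of which counters to include is also an assumption: some agents report overlapping counters (for example, system time that already includes wait time), so verify this before summing them.

```python
from __future__ import annotations

COUNTER32_MODULUS = 2**32  # assumption: the raw counters are 32-bit and wrap at this value


def cpu_shares(prev: dict[str, int], curr: dict[str, int]) -> dict[str, float]:
    """Percentage of elapsed ticks spent in each state between two samples."""
    # The modulo keeps the delta correct when a counter wrapped between polls.
    deltas = {name: (curr[name] - prev[name]) % COUNTER32_MODULUS for name in curr}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Hypothetical samples of ssCpuRawUser, ssCpuRawNice, ssCpuRawSystem, and ssCpuRawIdle.
previous = {"user": 1000, "nice": 0, "system": 200, "idle": 8800}
current = {"user": 1600, "nice": 0, "system": 300, "idle": 9100}
print(cpu_shares(previous, current))
```

Plotting the result of each poll interval over time gives the chart mentioned above.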
Disk I/O per partition
SNMP object: |
UCD-DISKIO-MIB::diskIOTable |
Community (v2c) /
Context (v3) |
Data and system |
- sda
If the 15-minute load (for details, see Monitoring CPU load averages) is getting close to 90%, your system does not have enough resources and you probably need to purchase more syslog-ng Store Box (SSB) appliances. For assistance, contact our Support Team.
- sdb
NOTE: This is only available on legacy SSB (syslog-ng Store Box T1) appliances.
If the 15-minute load (for details, see Monitoring CPU load averages) is getting close to 90%, your system does not have enough resources and you probably need to purchase more SSB appliances. For assistance, contact our Support Team.
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value changes quite often, depending on the I/O load and its type. |
Related issues and issue indicators |
When I/O load is too high, the system slows down. |
Solution:
Interfaces I/O by interface name
SNMP object: |
RFC1213-MIB::ifTable |
Community (v2c) /
Context (v3) |
Data and system |
The following interfaces can be monitored (the type of traffic that can affect the load):
- eth0 - external (network, redundant HA, next hop monitoring)
- eth1 - management (network, redundant HA, next hop monitoring)
- eth2 - internal (redundant, next hop monitoring)
- eth3 - HA
If the load on an interface seems to be too high, check whether you have configured SSB in a way that affects that interface. For example, if you do not use a management interface, the load on the external interface can be higher. Configuring next hop monitoring can also increase the load on an interface.
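To judge whether the load on an interface is high, the octet counters in the interface table can be converted to a throughput figure. The following is a minimal sketch that assumes two polls of a 32-bit ifInOctets (or ifOutOctets) counter and the interface speed in bits per second as reported by ifSpeed. The function names and the sample numbers are hypothetical.

```python
COUNTER32_MODULUS = 2**32  # ifInOctets and ifOutOctets are 32-bit counters that wrap


def octets_per_second(prev_octets: int, curr_octets: int, interval_s: float) -> float:
    """Average octets per second between two polls, tolerating one counter wrap."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = (curr_octets - prev_octets) % COUNTER32_MODULUS
    return delta / interval_s


def utilization_pct(prev_octets: int, curr_octets: int,
                    interval_s: float, if_speed_bps: int) -> float:
    """Share of the interface speed used in one direction, in percent."""
    bits_per_second = octets_per_second(prev_octets, curr_octets, interval_s) * 8
    return 100.0 * bits_per_second / if_speed_bps


# Hypothetical polls taken 10 seconds apart on a 1 Mbit/s link.
print(utilization_pct(1000, 126000, interval_s=10, if_speed_bps=1_000_000))  # prints 10.0
```

On fast links a 32-bit counter can wrap more than once between polls, so keep the polling interval short enough for the modulo correction to stay valid.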
RFC1213-MIB::ifTable
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value is continuously changing, depending on incoming logs and DRBD sync. |
Related issues and issue indicators |
I/O load may become too high on network interfaces, which may result in log loss, slow sync, and HA in degraded mode. |
Solution:
RFC1213-MIB::ifTable - ETH0, ETH3
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value is continuously changing, depending on the number of incoming logs. |
Related issues and issue indicators |
If the I/O load is too high, the network will not handle it, which may result in log loss. |
|
Caution:
Hazard of data loss. If the I/O load becomes too high for the network to handle, it may cause log loss. To avoid log loss, reconsider your configuration settings. Alternatively, consider upgrading the capacity of your SSB appliance, purchasing more SSB appliances, or contacting our Support Team. |
Solution:
RFC1213-MIB::ifTable - ETH3
For which systems and configurations is it applicable? |
Only applicable for HA clusters. |
Value change frequency |
Its value is continuously changing, depending on the number of incoming logs. |
Related issues and issue indicators |
Network traffic load too high for the NIC to handle.
(It rarely ever happens.) |
Solution:
SSB's version number
SNMP object: |
SSB-SNMP-MIB::ssbFirmwareVersion |
Type: |
String |
Community (v2c) /
Context (v3) |
Data |
Short description: |
Current version of the syslog-ng Store Box (SSB). |
Description: The current version number of SSB. This always changes after a successful upgrade.
Number of session files on SSB
SNMP object: |
SSB-SNMP-MIB::ssbHTTPSessions |
Type: |
Integer32 |
Community (v2c) /
Context (v3) |
Data |
Short description: |
Number of recently active HTTP-based connections to SSB. |
Description: The number of session files on SSB. These are generated as a result of the following events:
- Accessing the web user interface of SSB.
- Accessing a remote logspace.
- Performing an RPC API call.
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value is continuously changing, depending on the number of active connections. |
Related issues and issue indicators |
If the returned value changes too often within a short period of time, it can indicate a brute force attack. |
Solution:
Number of core files on SSB
SNMP object: |
SSB-SNMP-MIB::ssbCoreFiles |
Type: |
Integer (number of) |
Community (v2c) /
Context (v3) |
Data |
Short description: |
The number of core files in SSB's core firmware. |
Description: If the value of this parameter is larger than 0, contact our Support Team.
For which systems and configurations is it applicable? |
Applicable for all configurations and systems, but unless a core file is generated, its returned value is 0. |
Value change frequency |
Its value does not change often, only when a core file is generated. |
Related issues and issue indicators |
Even a single core file indicates an issue. When more than one of them appears, it indicates a more serious issue. |
Solution:
Available free space on SSB
SNMP object: |
SSB-SNMP-MIB::ssbUnusedLogStorageCapacity |
Type: |
Integer (% percent) |
Community (v2c) /
Context (v3) |
Data |
Short description: |
Ratio of free space on SSB compared to the Disk space fill up prevention limit. |
Description: The available free space on SSB.
|
Caution:
Hazard of data loss. If the value of this parameter is constantly close to 0%, fine-tune your configuration or purchase more SSB appliances. For assistance, contact our Support Team.
If the value of this parameter reaches 0%, SSB will stop receiving logs.
If you have an Archive Policy configured, archiving will start after the value of this parameter reaches 0%. Therefore, SSB might start receiving logs again after some time has passed. |
Make sure that you always have enough free space.
The definition of "enough" varies based on your specific configuration settings, for example:
- The disk size of your SSB appliance.
- The size, number and frequency of your incoming logs.
- Your Policies > Backup & Archive/Cleanup settings configuration.
- Your Basic Settings > Management > Disk space fill up prevention limit configuration. For details, see Preventing disk space fill up.
- and so on
Example: Available free space on SSB
To calculate the available free space on SSB, the following formula is used:
[Disk capacity of the core partition] - [The free space above the Basic Settings > Management > Disk space fill up prevention limit] - [The space that is already in use].
For example:
- Disk capacity of the core partition: This is always 100%
- The free space above the Basic Settings > Management > Disk space fill up prevention limit: If SSB is configured to Disconnect clients when disks are 90 percent used, this value is 100% - 90% = 10%
- The space that is already in use: 35%
Available free space on SSB = 100% - 10% - 35% = 55%
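The same arithmetic can be written as a short function, which is handy when you want to recompute the value for different limits. The function and parameter names below are illustrative assumptions; only the formula itself comes from the example above.

```python
def available_free_space_pct(fill_up_limit_pct: int, used_pct: int) -> int:
    """Apply the formula: 100% - (100% - fill up prevention limit) - space in use."""
    reserved_above_limit_pct = 100 - fill_up_limit_pct
    return 100 - reserved_above_limit_pct - used_pct


# Values from the example: clients are disconnected at 90% usage, and 35% is in use.
print(available_free_space_pct(fill_up_limit_pct=90, used_pct=35))  # prints 55
```

With the same 90% limit, the result reaches 0 when 90% of the disk is in use, which matches the point at which SSB stops receiving logs.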
For which systems and configurations is it applicable? |
Applicable for all configurations and systems. |
Value change frequency |
Its value is continuously decreasing, depending on available log storage capacity. |
Related issues and issue indicators |
As the returned value approaches 0, the available log storage capacity is continuously decreasing. |
Solution: