syslog-ng Store Box 6.10.0 - Administration Guide

Monitoring the HA cluster

The status of the HA cluster
SNMP object: SSB-SNMP-MIB::ssbHAClusterStatus
Type: String

Community (v2c) / Context (v3): Data
Short description: Status of the HA cluster.

Description: The status of the syslog-ng Store Box (SSB) cluster. For details, see Status.

For which systems and configurations is it applicable? Only applicable for HA clusters.
Value change frequency: When the HA cluster functions properly, this SNMP object returns the ha status most of the time. Other status values (for example, degraded) may occur occasionally, but ha should be the dominant value.
Related issues and issue indicators: If the returned status of an HA cluster is neither ha nor sync, the HA cluster is in degraded mode.

Solution:

  • Check your HA network.
  • Reboot the Secondary node.
  • Reboot the HA cluster.
  • For technical assistance, contact our Support Team.
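The rule above (any value other than ha or sync means degraded mode) can be sketched as a small check. This is a minimal illustration, not part of SSB: the function name is hypothetical, it assumes the value of SSB-SNMP-MIB::ssbHAClusterStatus has already been fetched as a string by your SNMP tooling, and it assumes that ignoring case and surrounding whitespace is acceptable.

```python
# Status values that the guide treats as healthy for ssbHAClusterStatus.
HEALTHY_HA_STATES = {"ha", "sync"}


def classify_ha_status(value: str) -> str:
    """Return 'ok' for ha or sync, and 'degraded' for any other value.

    Assumption: case and surrounding whitespace are not significant.
    """
    normalized = value.strip().lower()
    return "ok" if normalized in HEALTHY_HA_STATES else "degraded"
```

Because the guide notes that other values may occur occasionally, an alerting rule built on this check would probably want to react only when the degraded result persists across several polls.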
The status of the Redundant Heartbeat interface
SNMP object: SSB-SNMP-MIB::ssbHARedundantHeartbeatStatus
Type: String

Community (v2c) / Context (v3): Data
Short description: Status of the Redundant Heartbeat interface.

Description: The status of the Redundant Heartbeat interface. For details, see Redundant Heartbeat status.

For which systems and configurations is it applicable? Only applicable for HA clusters, and a value is returned only if Redundant HA is configured.
Value change frequency: When the cluster functions properly, this SNMP object returns the ok status most of the time. Other status values (for example, degraded) may occur occasionally, but ok should be the dominant value.
Related issues and issue indicators: Sometimes this SNMP object has an ok status while ssbHAClusterStatus is not ok. The HA cluster functions properly in this case, too.
The synchronization progress of HA nodes
SNMP object: SSB-SNMP-MIB::ssbHASynchronizationProgress
Type: Integer32 (0..100 %)

Community (v2c) / Context (v3): Data
Short description: HA cluster synchronization progress (in percent). The value is 100% if the cluster is fully synchronized.

Description: This value can be important in the following cases:

  • When you enable HA mode for the first time by navigating to Basic Settings > High Availability and clicking Convert to Cluster, the synchronization process starts. The value starts at 0% and gradually increases to 100%. When it reaches 100%, the conversion has finished and the nodes are in HA status.

  • If one of your nodes becomes unavailable and you decide to reinstall SSB, you have to rejoin the cluster by navigating to Basic Settings > High Availability and clicking Join HA. The synchronization process then starts from 0% again and gradually increases to 100%. When it reaches 100%, the join process has finished and the nodes are in HA status again.

  • If a node becomes unavailable for a longer period and is then joined again, the configuration of the two nodes may differ. In this case, the two nodes start the synchronization process again so that the new changes are transferred to the previously unavailable node. The synchronization value does not necessarily start at 0%. It may start from a number somewhere between 0% and 100%.

For which systems and configurations is it applicable? Only applicable for HA clusters.
Value change frequency: Following a conversion to an HA cluster, the returned value increases continuously until it reaches 100%. After reaching 100%, the returned value rarely changes, or does not change at all.
Related issues and issue indicators: The value has not yet reached 100%, but it does not change for a long time.

Solution:

  • Check your HA network.
  • Reboot the Secondary node.
  • Reboot the HA cluster.
  • For technical assistance, contact our Support Team.
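The issue indicator above (progress below 100% that stops changing) can be detected from a series of polled values. The following is a minimal sketch under stated assumptions: the function name, the sample format, and the 30-minute default threshold are all hypothetical, since the guide does not define how long "a long time" is.

```python
from typing import Sequence, Tuple


def sync_is_stalled(
    samples: Sequence[Tuple[float, int]],
    max_idle_seconds: float = 1800.0,  # hypothetical threshold, tune for your site
) -> bool:
    """Report whether synchronization progress looks stuck.

    samples holds (unix_timestamp, progress_percent) pairs, oldest first,
    as polled from SSB-SNMP-MIB::ssbHASynchronizationProgress.
    """
    if not samples:
        return False
    last_time, last_value = samples[-1]
    if last_value >= 100:
        return False  # fully synchronized, nothing to report
    # Find the earliest sample in the trailing run that has the same value.
    first_seen = last_time
    for timestamp, value in reversed(samples):
        if value != last_value:
            break
        first_seen = timestamp
    return (last_time - first_seen) >= max_idle_seconds
```

A monitoring job could append one sample per poll and call the function after each poll.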
Determining whether the HA node is the primary node
SNMP object: SSB-SNMP-MIB::ssbHAIsPrimary
Type: TruthValue (SNMP boolean value)

Community (v2c) / Context (v3): System
Short description: The current HA node is the primary node.

Description: This information is only supplied on HA-cluster nodes and it is available on the SNMP community provided by the boot-firmwares (the ID-based communities on the Basic Settings > Monitoring > SNMP agent settings page).

You can monitor which node is the primary HA node, that is, which node is responsible for SSB's business logic, for example, HTTP configuration and log management (syslog-ng, archive, backup).
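TruthValue is the standard SNMP boolean textual convention, in which true is encoded as the integer 1 and false as the integer 2. The sketch below shows one way to interpret readings of SSB-SNMP-MIB::ssbHAIsPrimary that were collected from each node. The function names and the node labels are hypothetical, and the raw integers are assumed to have been fetched already by your SNMP tooling.

```python
from typing import Dict, List


def truth_value_to_bool(raw: int) -> bool:
    """Convert an SNMP TruthValue integer (true=1, false=2) to a bool."""
    if raw == 1:
        return True
    if raw == 2:
        return False
    raise ValueError(f"not a valid TruthValue: {raw!r}")


def primary_nodes(readings: Dict[str, int]) -> List[str]:
    """Return the names of the nodes that report themselves as primary.

    readings maps a node label (hypothetical, chosen by you) to the raw
    ssbHAIsPrimary value polled from that node.
    """
    return sorted(name for name, raw in readings.items() if truth_value_to_bool(raw))
```

For example, `primary_nodes({"node-a": 1, "node-b": 2})` returns `["node-a"]`. A result with zero or two entries would be worth investigating.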

Monitoring hardware RAID

SNMP object: SSB-SNMP-MIB::ssbHardwareRaid
Type: This is a grouping node

Community (v2c) / Context (v3): System
Short description: Detailed information about hardware RAID devices.

Description: Monitor the hardware RAID of syslog-ng Store Box (SSB), which is responsible for providing disk availability (https://en.wikipedia.org/wiki/RAID#Hardware-based), for example, if a disk fails. This object is used to monitor the status of the disks in an SSB appliance.

Available on SSB appliances (except T1), on the SNMP community provided by the boot-firmwares, that is, the ID-based communities on the Basic Settings > Monitoring > SNMP agent settings page.

RAID controller battery state
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidBatteryState
Type: string

Community (v2c) / Context (v3): System
Short description: The battery state of the RAID controller. The value of State from the Cachevault_Info or BBU_Info table of StorCLI.
For which systems and configurations is it applicable? When this particular hardware supports this particular (T4, T10, S, M) RAID type.
Value change frequency: The returned value is not supposed to change. A change indicates an issue.
Related issues and issue indicators: The issue is generally caused by a power outage or a hardware error (for example, natural battery wear).

Solution:

  • For technical assistance, contact our Support Team.
  • Check power supply (for example, check if the power cord is damaged, or if the machine is running from battery, and so on).
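Because the battery state is not supposed to change, a monitoring script only needs to compare each polled value with the previous one. The sketch below illustrates that comparison. The function name is hypothetical, and the state strings in the usage note are placeholders, not documented StorCLI values.

```python
from typing import Optional


def detect_battery_state_change(previous: Optional[str], current: str) -> Optional[str]:
    """Return an alert message if the battery state changed, else None.

    previous is None on the first poll, when there is nothing to compare.
    Both values are the strings returned by
    SSB-SNMP-MIB::ssbHardwareRaidBatteryState.
    """
    if previous is None:
        return None
    if previous != current:
        return f"RAID battery state changed: {previous!r} -> {current!r}"
    return None
```

For example, calling the function with the placeholder values `"state-a"` and `"state-b"` returns an alert message, while calling it with two identical values returns `None`.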
RAID controller firmware version
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidControllerFirmwareVersion
Type: string

Community (v2c) / Context (v3): System
Short description: Version of the controller's firmware. This value is reported by StorCLI.
For which systems and configurations is it applicable? When this particular hardware does not support this particular (T4, T10, S, M) RAID type.
Value change frequency: Rarely, only when the RAID firmware is updated.
Hardware RAID status
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidStatus
Type: string

Community (v2c) / Context (v3): System
Short description: Status of the hardware RAID.
For which systems and configurations is it applicable? When this particular hardware supports this particular (T4, T10, S, M) RAID type.
Value change frequency: The value rarely changes.
Related issues and issue indicators: A returned status value other than the optimal active indicates that the RAID is in degraded mode.
Hardware RAID synchronization progress
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidSyncRate
Type: Integer32

Community (v2c) / Context (v3): System
Short description: Progress of hardware RAID synchronization (in percent).
For which systems and configurations is it applicable? When this particular hardware supports this particular (T4, T10, S, M) RAID type.
Value change frequency: When the RAID is not syncing, no value is returned. Otherwise, the value changes continuously and may drop when resyncing.
Related issues and issue indicators: If the progress does not change for a longer period of time, it indicates an issue.

Solution:

StorCLI's PD LIST (Physical Drive)

Device ID (DID)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdDeviceID
Type: Integer32

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'DID' output in 'PD LIST'.
Drive Group (DG)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdDriveGroup
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'DG' output in 'PD LIST'.
Enclosure Device ID and Slot Number (EID:Slt)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdEnclosureDeviceIDAndSlotNumber
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'EID:Slt' output in 'PD LIST'.
Interface (Intf)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdInterface
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Intf' output in 'PD LIST'.
Media Type (Med)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdMediaType
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Med' output in 'PD LIST'.
Model
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdModel
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Model' output in 'PD LIST'.
Protection Info (PI)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdProtectionInfo
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'PI' output in 'PD LIST'.
Sector Size (SeSz)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdSectorSize
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'SeSz' output in 'PD LIST'.
Self Encrypting Drive (SED)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdSelfEncryptingDrive
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'SED' output in 'PD LIST'.
Size
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdSize
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Size' output in 'PD LIST'. The size of the physical drive.
Spun (Sp)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdSpun
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Sp' output in 'PD LIST'.
State
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPdState
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'State' output in 'PD LIST'.
SSB-SNMP-MIB::ssbHardwareRaidPDTable
For which systems and configurations is it applicable? When this particular hardware supports this particular (T4, T10, S, M) RAID type.
Value change frequency: The contents of the table are not supposed to change (except when one or more disks within the RAID fail, or in case of a disk error).
Related issues and issue indicators: In case of a hardware error or disk error, one or more disks within the RAID may fail. In this case, the HotSpare takes the place of the faulty disks, and it is advisable to find out what caused the disk failure.

Solution:

StorCLI's Drive State

Drive Temperature
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidDriveTemperature
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Drive Temperature' output from the 'Drive State' information list.
Media Error Count
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidMediaErrorCount
Type: Integer32

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Media Error Count' output from the 'Drive State' information list.
Other Error Count
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidOtherErrorCount
Type: Integer32

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Other Error Count' output from the 'Drive State' information list.
Predictive Failure Count
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidPredictiveFailureCount
Type: Integer32

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Predictive Failure Count' output from the 'Drive State' information list.
Shield Counter
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidShieldCounter
Type: Integer32

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'Shield Counter' output from the 'Drive State' information list.
S.M.A.R.T alert flagged by drive
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidSmartAlertFlaggedByDrive
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'S.M.A.R.T alert flagged by drive' output from the 'Drive State' information list.
S.M.A.R.T BBM Error Count
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidSmartBBMErrorCount
Type: Integer32

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'BBM Error Count' output from the 'Drive State' information list. This information is only present on T4 and T10 machines.
S.M.A.R.T Enclosure Device ID and Slot Number (EID:Slt)
SNMP object: SSB-SNMP-MIB::ssbHardwareRaidSmartEnclosureDeviceIDAndSlotNumber
Type: string

Community (v2c) / Context (v3): System
Short description: The value of StorCLI's 'EID:Slt' output from the 'Drive' information list.