The goal of HA clusters is to support enterprise business continuity by providing location-independent load balancing and failover.

In High Availability (HA) mode, two One Identity Safeguard for Privileged Sessions (SPS) units with identical configurations operate simultaneously. These two units are the primary node and the secondary node (previously also referred to as the master node and the slave node). The primary node shares all data with the secondary node, and if the primary node stops functioning, the secondary node becomes active immediately, so the servers remain continuously accessible.

NOTE: To ensure the stability of the connection, One Identity recommends a direct physical connection between the nodes in the HA cluster. When the secondary node takes over, gratuitous ARP requests are sent to inform hosts on the local network that the MAC addresses behind the IP addresses of the cluster have changed.
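To illustrate what a gratuitous ARP request looks like on the wire, the following is a minimal Python sketch that only builds the frame bytes. It is not the SPS implementation: the function name build_gratuitous_arp and the MAC and IP addresses in the usage note are hypothetical, and the sketch assumes IPv4 over Ethernet.

```python
import socket
import struct


def build_gratuitous_arp(mac: str, ip: str) -> bytes:
    """Return a raw Ethernet frame that carries a gratuitous ARP request.

    Hypothetical helper for illustration only; it builds bytes and sends nothing.
    """
    sender_mac = bytes.fromhex(mac.replace(":", ""))
    sender_ip = socket.inet_aton(ip)
    broadcast = b"\xff" * 6

    # Ethernet header: broadcast destination, source MAC, EtherType 0x0806 (ARP).
    ethernet = broadcast + sender_mac + struct.pack("!H", 0x0806)

    # ARP payload for Ethernet/IPv4.
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # opcode: request
        sender_mac,   # sender hardware address (the new owner of the IP)
        sender_ip,    # sender protocol address
        b"\x00" * 6,  # target hardware address: not used in a gratuitous request
        sender_ip,    # target IP equals sender IP, which makes the request gratuitous
    )
    return ethernet + arp
```

For example, build_gratuitous_arp("02:00:00:aa:bb:cc", "192.0.2.10") returns a 42-byte frame in which the sender and target IP fields both contain 192.0.2.10. Hosts that receive such a frame update their ARP cache so that the IP address maps to the MAC address of the node that is now active.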

The primary node shares all data with the secondary node using the HA network interface (labeled as 4 or HA on the SPS appliance). The disks of the primary and the secondary node must be synchronized for the HA support to operate correctly. Interrupting the connection between running nodes (unplugging the Ethernet cables, rebooting a switch or a router between the nodes, or disabling the HA interface) disables data synchronization and forces the secondary node to become active. This might result in data loss. You can find instructions to resolve such problems and recover an SPS cluster in Troubleshooting a One Identity Safeguard for Privileged Sessions (SPS) cluster.

NOTE: HA functionality was designed for physical SPS units. If SPS is used in a virtual environment, use the fallback functionalities provided by the virtualization service instead.

The Basic Settings > High Availability page provides information about the status of the HA cluster and its nodes.

Figure 133: Basic Settings > High Availability — Managing a High Availability cluster

The following information is available about the cluster:

The active (that is, primary) SPS node is labeled as This node. This unit inspects the SSH traffic and provides the web interface. The SPS unit labeled as Other node is the secondary node that is activated if the primary node becomes unavailable.

The following information is available about each node:

  • Node ID: The MAC address of the HA interface of the node. This address is also printed on a label on the top cover of the SPS unit.

    For SPS clusters, the IDs of both nodes are included in the internal log messages of SPS. Note that if the central log server is a syslog-ng server, the keep-hostname option should be enabled on the syslog-ng server.

  • Node HA state: Indicates whether the SPS nodes recognize each other properly and whether they are configured to operate in High Availability mode. For details, see Understanding One Identity Safeguard for Privileged Sessions (SPS) cluster statuses.

  • Node HA UUID: A unique identifier of the cluster. Only available in High Availability mode.

  • DRBD status: The status of data synchronization between the nodes. For details, see Understanding One Identity Safeguard for Privileged Sessions (SPS) cluster statuses.

  • RAID status: The status of the RAID device of the node. If it is not Optimal, there is a problem with the RAID device. For details, see Understanding One Identity Safeguard for Privileged Sessions (SPS) RAID status.

  • Firmware version: Version number of the firmware.

  • HA link speed: The maximum allowed speed between the primary node and the secondary node. The speed of the HA link must exceed the DRBD sync rate limit, otherwise the web UI might become unresponsive and data loss can occur.

  • Interfaces for Heartbeat: Virtual interface used only to detect that the other node is still available. This interface is not used to synchronize data between the nodes (only heartbeat messages are transferred).

    You can find more information about configuring redundant heartbeat interfaces in Redundant heartbeat interfaces.

  • Next hop monitoring: IP addresses (usually next hop routers) that both the primary node and the secondary node continuously monitor by using ICMP echo (ping) messages. If any of the monitored addresses becomes unreachable from the primary node while it is still reachable from the secondary node (in other words, more monitored addresses are accessible from the secondary node), then it is assumed that the primary node is unreachable and a forced takeover occurs, even if the primary node is otherwise functional. For details, see Next-hop router monitoring.
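The takeover rule described for next hop monitoring can be sketched in Python as follows. This is an illustration of the rule as stated above, not the SPS implementation: the function names, the dictionary layout (monitored IP address mapped to whether the last ping succeeded), and the example addresses are assumptions made for the sketch.

```python
from typing import Mapping


def reachable_count(ping_results: Mapping[str, bool]) -> int:
    """Count the monitored addresses that answered the ICMP echo request."""
    return sum(1 for answered in ping_results.values() if answered)


def forced_takeover_expected(
    primary_results: Mapping[str, bool],
    secondary_results: Mapping[str, bool],
) -> bool:
    """Return True if more monitored addresses are reachable from the secondary node.

    Hypothetical helper: it only compares the two nodes' ping results and
    does not model any other HA state.
    """
    return reachable_count(secondary_results) > reachable_count(primary_results)


# Example with hypothetical next hop routers: the primary node lost one router.
primary = {"192.0.2.1": False, "192.0.2.2": True}
secondary = {"192.0.2.1": True, "192.0.2.2": True}
takeover = forced_takeover_expected(primary, secondary)
```

In this example, takeover is True because the secondary node reaches two monitored addresses while the primary node reaches only one. If both nodes lose the same router, the counts are equal and the sketch reports no forced takeover.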