
One Identity Safeguard for Privileged Sessions 6.0.6 - Administration Guide


Recovering from a split brain situation

A split brain situation is caused by a temporary failure of the network link between the cluster nodes, resulting in both nodes switching to the active (that is, primary node) role while disconnected. This might cause new data (for example, audit trails) to be created on both nodes without being replicated to the other node. Thus, it is likely in this situation that two diverging sets of data have been created, which cannot be trivially merged.

Caution:

Hazard of data loss: In a split brain situation, valuable audit trails might exist on both One Identity Safeguard for Privileged Sessions (SPS) nodes, so take special care to avoid data loss.

The nodes of the SPS cluster automatically recognize the split brain situation once the connection between them is reestablished and, to prevent data loss, do not perform any data synchronization. When a split brain situation is detected, SPS sends an alert, and the situation is visible on the SPS system monitor, in the system logs (Split-Brain detected, dropping connection!), and on the Basic Settings > High Availability page.
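As a rough illustration of the log-based check, the sketch below scans a log file for the message quoted above. The function name and the log path in the usage comment are assumptions; where this message ends up depends on how system logging is configured in your environment.

```shell
# Hypothetical helper: succeed if the given log file contains the
# split brain message quoted in this section.
has_split_brain() {
    logfile=$1
    # -F matches the text literally; -q prints nothing and only sets
    # the exit status.
    grep -Fq 'Split-Brain detected, dropping connection!' "$logfile"
}

# Example (the path is an assumption, adjust it to your log target):
# has_split_brain /var/log/messages && echo "split brain reported"
```

The function returns a non-zero status both when the message is absent and when the file cannot be read, so treat a failure as "not confirmed" rather than "healthy".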

Once the network connection between the nodes has been reestablished, one of the nodes becomes the active (that is, primary) node, while the other becomes the backup (that is, secondary) node. This means that one node provides services as in normal operation, and the other is kept passive (as a backup) to avoid network interference. Note that there is no synchronization between the nodes at this stage.

To recover an SPS cluster from a split brain situation, complete the following steps.

Caution:

Do NOT shut down the nodes.

Data recovery

In the procedure described here, data will be saved from the host currently acting as the secondary node. This is required because data on this host will later be overwritten by the data available on the current primary node.

NOTE:

During data recovery, there will be no service provided by SPS.

To recover from a split brain situation

  1. Log in to the primary node. If no Console menu is showing up after login, then this is the secondary node. In this case, try the other node.

  2. Select Shells > Boot Shell.

  3. Enter /usr/share/heartbeat/hb_standby. This will change the current secondary node to primary node and the current primary node to secondary node (HA failover).

  4. Exit the console.

  5. Wait a few seconds for the HA failover to complete.

  6. Log in on the other host. If no Console menu is showing up, the HA failover has not completed yet. Wait a few seconds and try logging in again.

  7. Select Shells > Core Shell.

  8. Issue the systemctl stop zorp-core.service command to disable all traffic going through SPS.

  9. Save the files from /var/lib/zorp/audit that you want to keep. Use scp or rsync to copy data to your remote host.

    TIP:

    To find the files modified in the last n*24 hours, use find . -mtime -n.

    To find the files modified in the last n minutes, use find . -mmin -n.

  10. Enter:

    pg_dump -U scb -f /root/database.sql

    Back up the /root/database.sql file.

  11. Exit the console.

  12. Log in again, and select Shells > Boot Shell.

  13. Enter /usr/share/heartbeat/hb_standby. This will change the current secondary node to primary node and the current primary node to secondary node (HA failover).

  14. Exit the console.

  15. Wait a few minutes to let the failover happen, so the node you were using will become the secondary node and the other node will become the primary node.

    The nodes are still in a split-brain state but now you have all the data backed up from the secondary node, and you can synchronize the data from the primary node to the secondary node, which will turn the HA state from "Split-brain" to "HA". For details on how to do that, see HA state recovery.
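The copy in step 9 can be narrowed to recent files by combining the find tip with rsync. This is a minimal sketch, assuming rsync is available on the node; the function name is hypothetical and backup@remote.example:/backup/ is a placeholder destination that you replace with your own.

```shell
# Hypothetical helper: print the files under a directory that were
# modified in the last N minutes, one path per line, relative to
# that directory.
recent_audit_files() {
    dir=$1
    minutes=$2
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && find . -type f -mmin "-$minutes" )
}

# Example: copy the trails from the last 24 hours (1440 minutes).
# The destination is a placeholder.
# recent_audit_files /var/lib/zorp/audit 1440 |
#     rsync -a --files-from=- /var/lib/zorp/audit/ backup@remote.example:/backup/
```

Choose a time window that safely covers the whole period during which the nodes were disconnected; copying too much is harmless, copying too little is not.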

HA state recovery

In the procedure described here, the "Split-brain" state will be turned to the "HA" state. Keep in mind that the data on the current primary node will be copied to the current secondary node and data that is available only on the secondary node will be lost (as that data will be overwritten).

Steps: Swapping the nodes (optional)

NOTE:

If you completed the procedure described in Data recovery, you do not have to swap the nodes. You can proceed to the steps about data synchronization.

If you want to swap the two nodes to make the primary node the secondary node and the secondary node the primary node, perform the following steps:

  1. Log in to the primary node. If no Console menu is showing up after login, then this is the secondary node. In this case, try the other node.

  2. Select Shells > Boot Shell.

  3. Enter /usr/share/heartbeat/hb_standby. This will output:

    Going standby [all]

  4. Exit the console.

  5. Wait a few minutes to let the failover happen, so the node you were using will become the secondary node and the other node will be the primary node.

Steps: Initializing data synchronization

To initialize data synchronization, complete the following steps:

  1. Log in to the secondary node. If the Console menu is showing up, then this is the primary node. In this case, try logging in to the other node.

  2. Enter the following commands. These commands will make the secondary node discard the data available only here, on this node.

    drbdadm secondary r0
    drbdadm connect --discard-my-data r0

  3. Log out of the secondary node.

  4. Log in to the primary node.

  5. Select Shells > Boot Shell.

  6. Enter:

    drbdadm connect r0

  7. Exit the console.

  8. Check the High Availability state on the web interface of SPS, in the Basic Settings > High Availability > Status field. During synchronization, the status will say Degraded Sync, and after the synchronization completes, it will say HA.
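Because the discard option must only be run on the node whose data you are giving up, it can help to keep the per-node commands in one place. The sketch below only prints the commands from the steps above for a given role and runs nothing; the function name is hypothetical.

```shell
# Hypothetical helper: print (do not run) the DRBD commands from the
# synchronization steps for the node role given as the first argument.
sync_commands() {
    case $1 in
        secondary)
            # Data that exists only on this node will be overwritten.
            echo 'drbdadm secondary r0'
            echo 'drbdadm connect --discard-my-data r0'
            ;;
        primary)
            echo 'drbdadm connect r0'
            ;;
        *)
            echo 'usage: sync_commands primary|secondary' >&2
            return 1
            ;;
    esac
}
```

Running sync_commands secondary on the node you believe to be secondary, and comparing the output with what you are about to type, is a cheap check before issuing a command that discards data.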

Replacing an HA node in a One Identity Safeguard for Privileged Sessions (SPS) cluster

The following describes how to replace a unit in a One Identity Safeguard for Privileged Sessions (SPS) cluster with a new appliance.

To replace a unit in an SPS cluster with a new appliance

  1. Verify the HA status on the working node. Select Basic Settings > High Availability. If one of the nodes has broken down or is missing, the Status field displays DEGRADED.

  2. Note down the Gateway IP addresses, and the IP addresses of the Heartbeat and the Next hop monitoring interfaces.

  3. Perform a full system backup. Before replacing the node, create a complete system backup of the working node. For details, see Data and configuration backups.

  4. Check which firmware version is running on the working node. Select Basic Settings > System > Version details and write down the exact version numbers.

  5. Log in to your support portal and download the CD ISO for the same SPS version that is running on your working node.

  6. Without connecting the replacement unit to the network, install the replacement unit from the ISO file. Use the IPMI interface if needed.

  7. When the installation is finished, connect the two SPS units with an Ethernet cable via the Ethernet connectors labeled as 4 or HA.

  8. Reboot the replacement unit and wait until it finishes booting.

  9. Log in to the working node and verify the HA state. Select Basic Settings > High Availability. The Status field should display HALF.

  10. Reconfigure the Gateway IP addresses, and the IP addresses of the Heartbeat and the Next hop monitoring interfaces. Click Commit.

  11. Click Other node > Join HA.

  12. Click Other node > Reboot.

  13. The replacement unit will reboot and start synchronizing data from the working node. The Basic Settings > High Availability > Status field will display DEGRADED SYNC until the synchronization finishes. Depending on the size of the hard disks and the amount of data stored, this can take several hours.

  14. After the synchronization is finished, connect the other Ethernet cables to their respective interfaces (external to 1 or EXT, internal to 3 or INT, management to 2 or MGMT) as needed for your environment.

    Expected result

    A node of the SPS cluster is replaced with a new appliance.
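The Status values mentioned in these procedures can be summarized in a small lookup. This is a sketch only: the function name and the short labels it prints are hypothetical, the meanings are limited to what this section states, and it is not a complete list of the statuses SPS can display.

```shell
# Hypothetical helper: map a Status value from the Basic Settings >
# High Availability page to the stage it indicates in this section.
describe_ha_status() {
    # Compare case-insensitively: the text uses both "Degraded Sync"
    # and "DEGRADED SYNC".
    status=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $status in
        ha)              echo 'synchronized' ;;      # sync has completed
        'degraded sync') echo 'synchronizing' ;;     # can take several hours
        degraded)        echo 'node missing' ;;      # a node is broken or absent
        half)            echo 'ready to join' ;;     # replacement booted, not joined yet
        split-brain)     echo 'split brain' ;;       # follow the recovery steps
        *)               echo 'unknown'; return 1 ;;
    esac
}
```

For example, describe_ha_status "DEGRADED SYNC" prints synchronizing, which in the replacement procedure means you should wait before connecting the remaining Ethernet cables.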

Resolving an IP conflict between cluster nodes

The IP addresses of the HA interfaces connecting the two nodes are detected automatically during boot. When a node comes online, it attempts to connect to the IP address 1.2.4.1. If no other node responds before the timeout, the node sets the IP address of its own HA interface to 1.2.4.1. Otherwise (if a node responds on 1.2.4.1), it sets its own HA interface to 1.2.4.2.

Replaced nodes do not yet know the HA configuration (or any other HA settings), and will attempt to negotiate it automatically in the same way. If the network is, for any reason, too slow to connect the nodes on time, the replacement node boots with the IP address of 1.2.4.1, which can cause an IP conflict if the other node has also set its IP to that same address previously. In this case, the replacement node cannot join the HA cluster.
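The negotiation rule can be sketched as a small function. This models only the decision described above, not how SPS implements it: the function name is hypothetical, and the probe is a caller-supplied command that must succeed when a node answers on the given address before the timeout.

```shell
# Sketch of the boot-time decision described above. $1 is the name of
# a probe command (hypothetical) that is called with an IP address and
# succeeds if a node already answers on that address in time.
choose_ha_address() {
    probe=$1
    if "$probe" 1.2.4.1; then
        echo 1.2.4.2   # another node already holds 1.2.4.1
    else
        echo 1.2.4.1   # nobody answered in time: take the first address
    fi
}

# Example probe for experimenting on Linux (an assumption, not the
# mechanism SPS uses):
# probe_ping() { ping -c 1 -W 2 "$1" >/dev/null 2>&1; }
# choose_ha_address probe_ping
```

The sketch makes the failure mode visible: if the link is too slow, the probe fails on the replacement node even though the other node already holds 1.2.4.1, so both nodes end up with 1.2.4.1.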

To manually assign the correct IP address to the HA interface of a node, perform the following steps:

  1. Log in to the node using the IPMI interface or the physical console.

    Configuration changes have not been synced to the new (replacement) node, as it could not join the HA cluster. Use the default password of the root user of One Identity Safeguard for Privileged Sessions (SPS). For details, see "Installing the SPS hardware" in the Installation Guide.

  2. From the console menu, choose 10 HA address.

    Figure 298: The console menu

  3. Choose the IP address of the node.

    Figure 299: The console menu

  4. Reboot the node.

Understanding One Identity Safeguard for Privileged Sessions (SPS) RAID status

This section explains the possible statuses of the One Identity Safeguard for Privileged Sessions (SPS) RAID device and the underlying hard disks. SPS displays this information on the Basic Settings > High Availability page. The following statuses can occur:
