The following information provides insight into how SPP processes access requests during the cluster patching process. It also describes what happens if a cluster member loses power or network connectivity during the patching process.

Service guarantees

During a cluster upgrade, the cluster is split logically into the current version (side A) and the upgrade version (side B). Access request workflow is only enabled on one side at a time. Audit logs run on both sides and merge when the cluster patch completes. Initially, access request workflow is only enabled on side A, and replicas in PatchPending state can perform access requests. As appliances upgrade and move to side B, the access workflow migrates to side B when side B has a majority of the appliances.

At this point in the upgrade process, replicas in PatchPending state can no longer perform access requests; however, all of the upgraded cluster members can perform access requests. There is a small window where access request workflow is unavailable as the data migrates from one side to the other.

Failure scenarios

If the primary appliance loses power or loses network connectivity during the upgrade process, it will try to resume the upgrade on restart.

If a replica is disconnected or loses power during an upgrade process, the replica will most likely go into quarantine mode. The primary appliance will skip that appliance and remove it from the cluster. This replica will need to be reset, upgraded, and then re-enrolled into the cluster manually to recover.

Configuration for password and SSH key check out

The policy may be configured such that a password or SSH key reset is required before the password or SSH key can be checked out again. If that is the case, the following can be temporarily configured prior to cluster patching and access request to allow for password or SSH key check out when a password or SSH key has not been reset.

  • The policy can be set to allow multiple accesses.
  • The policy can be set to not require a password or SSH key change at check in.
  • Emergency requests can be allowed so the user does not have to wait for the password or SSH key to be reset.