The following workarounds can be performed to get the SSB to a healthy state:
A. Check whether the boot firmware is accessible. If it is, try to connect from the boot shell to the core shell by entering the following command:
#core-shell
a. If the core shell is accessible from the boot shell, run the following command to list the failed processes:
#systemctl --failed
B. If the core shell is not accessible from the boot shell, the core shell can be restarted from the boot shell by the following command:
C. If the SSB is a High Availability cluster, the core shell can be restarted by performing a takeover (from master to slave). Every node has its own core shell; switching from master to slave unmounts the core shell on the master and starts it on the slave.
The takeover and the core shell restart take no more than 2-3 minutes. However, it is advisable to perform them when few or no messages are expected to be sent to the SSB.
Once the core shell has been restarted (either from the boot shell or by a takeover from the master node to the slave), the failure message should disappear.
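The failed-process check in step A.a can also be reduced to a plain list of unit names, which is easier to paste into a support ticket. The following is only a minimal sketch: the function name `failed_units` is hypothetical and not part of SSB, and it assumes the core shell provides a POSIX shell and `awk`.

```shell
# Hypothetical helper (not shipped with SSB): print only the unit names
# from `systemctl --failed` style output.
# It reads from standard input, so it can be tried on saved output
# without touching a live system.
failed_units() {
    # With --plain and --no-legend, every non-empty line starts with the
    # unit name, so the first whitespace-separated field is enough.
    awk 'NF > 0 { print $1 }'
}
```

Assuming the installed systemctl accepts the standard `--plain`, `--no-legend` and `--no-pager` options, the helper could be used from the core shell as `systemctl --failed --plain --no-legend --no-pager | failed_units`. Empty output means that no failed units were reported.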