High availability with keepalived
High availability (HA) keeps a system running even if some components fail. In an active-passive HA setup, two servers work together:
- The active server handles all requests.
- The passive server stays on standby and takes over if the active server fails.
This guide shows how to configure HA for F5 NGINX Instance Manager using keepalived. This setup includes:
- A virtual IP address (VIP)
- A shared Network File System (NFS)
- Automated health checks to detect failures and trigger failover
Before setting up high availability (HA) for NGINX Instance Manager, make sure you have:
- Two servers with NGINX Instance Manager installed
- A reserved virtual IP address (VIP) that always points to the active instance
- An NFS share that both servers can access
- Permissions to manage IP addresses at the operating system level
- `keepalived` installed on both servers
Some cloud platforms don’t allow direct IP management with keepalived. If you’re using a cloud environment, check whether it supports VIP assignment.
This HA setup has the following restrictions:
- This setup supports only two nodes — one active and one passive. Configurations with three or more nodes are not supported.
- Active/active HA is not supported. This configuration works only in an active-passive setup.
- Do not modify `keepalived`. Changes beyond what is documented may cause failures.
- OpenID Connect (OIDC) authentication is not supported when NGINX Instance Manager is running in forward-proxy mode. OIDC is configured on the NGINX Plus layer and cannot pass authentication requests through a forward proxy.
A virtual IP address (VIP) ensures that users always connect to the active server. During failover, keepalived automatically moves the VIP from the primary to the secondary server.
- Choose an unused IP address in your network to serve as the VIP.
- Ensure that the IP address does not conflict with existing devices.
- Configure firewalls and security rules to allow traffic to and from the VIP.
- Note the VIP address, as you will reference it in the `keepalived.conf` file.
Replace `<VIRTUAL_IP_ADDRESS>` with this IP when configuring `keepalived`.
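As a quick sketch of that substitution (the template snippet, the output file name, and the VIP `192.0.2.10` are documentation-only examples, not values from your environment), you can let `sed` fill in the placeholder:

```shell
# Example only: 192.0.2.10 is a documentation address (TEST-NET-1), and
# keepalived.conf.tmpl is a hypothetical template file for illustration.
VIP="192.0.2.10"

printf 'virtual_ipaddress {\n    <VIRTUAL_IP_ADDRESS>\n}\n' > keepalived.conf.tmpl

# Replace the placeholder with the reserved VIP
sed "s/<VIRTUAL_IP_ADDRESS>/${VIP}/" keepalived.conf.tmpl > keepalived.conf.example

# Show the line that now carries the real address
grep "$VIP" keepalived.conf.example
```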
keepalived is a Linux tool that monitors system health and assigns a virtual IP (VIP) to the active server in an HA setup.
Install keepalived on both servers.
- For Debian-based systems (Ubuntu, Debian):

  ```sh
  sudo apt update
  sudo apt install keepalived -y
  ```

- For RHEL-based systems (CentOS, RHEL):

  ```sh
  sudo yum install keepalived -y
  ```
keepalived monitors specific services to determine if a node is operational. Update /etc/nms/scripts/nms-notify-keepalived.sh to include the services you want to monitor.
```sh
check_nms_services=(
    "clickhouse-server"
    "nginx"
    "nms-core"
    "nms-dpm"
    "nms-integrations"
    "nms-ingestion"
)
```

**Update nms.conf on both nodes when changing the mode of operation:** If you switch between connected and disconnected modes, you must update /etc/nms/nms.conf on both the primary and secondary nodes if `nms-integrations` is included in `check_nms_services`. NGINX Instance Manager runs in connected mode by default. For instructions on changing the mode, see the installation guide for disconnected environments.
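To illustrate what such monitoring can look like, here is a minimal, hypothetical sketch of the kind of loop a keepalived check script runs over that array. The real `nms-check-keepalived.sh` shipped with NGINX Instance Manager may differ; the stubbed `IS_ACTIVE` command and the shortened service list are assumptions so the sketch can run as a dry run anywhere:

```shell
# Hypothetical sketch only; the real nms-check-keepalived.sh may differ.
check_nms_services=(
    "clickhouse-server"
    "nginx"
    "nms-core"
)

# Stubbed for a dry run; on a real node you would use:
# IS_ACTIVE="systemctl is-active --quiet"
IS_ACTIVE="true"

failed=0
for svc in "${check_nms_services[@]}"; do
    # A failed check marks this node unhealthy, which affects its
    # keepalived priority via the vrrp_script weight.
    if ! $IS_ACTIVE "$svc"; then
        echo "service down: $svc"
        failed=1
    fi
done

if [ "$failed" -eq 0 ]; then
    echo "node healthy"
else
    echo "node unhealthy"
fi
```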
Edit /etc/keepalived/keepalived.conf on both servers and replace the placeholders with your actual network details.
```
vrrp_script nms_check_keepalived {
    script "/etc/nms/scripts/nms-check-keepalived.sh"
    interval 10
    weight 10
}

vrrp_instance VI_28 {
    state MASTER                    # Set to BACKUP on the secondary server
    interface <NETWORK_INTERFACE>   # Replace with the correct network interface
    priority 100
    virtual_router_id 251
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass <AUTH_PASSWORD>   # Replace with a secure password
    }
    virtual_ipaddress {
        <VIRTUAL_IP_ADDRESS>        # Replace with your reserved VIP
    }
    track_script {
        nms_check_keepalived
    }
    notify /etc/nms/scripts/nms-notify-keepalived.sh
}
```

Replace:

- `<NETWORK_INTERFACE>` with your actual network interface (for example, `ens32`).
- `<AUTH_PASSWORD>` with a secure authentication password.
- `<VIRTUAL_IP_ADDRESS>` with your reserved VIP.
Ensure the configuration is identical on both servers, except for the state value:
- Set `MASTER` on the primary server.
- Set `BACKUP` on the secondary server.
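As an illustrative sanity check (the file names below are hypothetical stand-ins for the configs copied from the two nodes), `diff` should report only the `state` line:

```shell
# Hypothetical stand-ins for the two nodes' keepalived.conf files;
# only the state line should differ between them.
printf 'state MASTER\npriority 100\nvirtual_router_id 251\n' > primary.conf
printf 'state BACKUP\npriority 100\nvirtual_router_id 251\n' > secondary.conf

# diff emits one "<" line and one ">" line when only `state` differs
diff primary.conf secondary.conf | grep -c '^[<>]'   # prints 2
```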
Restart `keepalived` to apply the configuration:

```sh
sudo systemctl restart keepalived
```

NGINX Instance Manager requires shared storage for configuration files and logs.
Replace <NFS_SERVER_IP> with the actual IP address of your NFS server in the following commands.
```sh
sudo mount -t nfs4 \
  -o rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys \
  <NFS_SERVER_IP>:/mnt/nfs_share/clickhouse \
  /var/lib/clickhouse
```

```sh
sudo mount -t nfs4 \
  -o rw,relatime,vers=4.2,rsize=524288,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys \
  <NFS_SERVER_IP>:/mnt/nfs_share/nms \
  /var/lib/nms
```

The argument to `-o` must be a single comma-separated word; do not split it across lines. Add the following lines to /etc/fstab on both servers, replacing `<NFS_SERVER_IP>` with your actual NFS server's IP:
```
<NFS_SERVER_IP>:/mnt/nfs_share/clickhouse /var/lib/clickhouse nfs defaults 0 0
<NFS_SERVER_IP>:/mnt/nfs_share/nms /var/lib/nms nfs defaults 0 0
```

Run these commands to confirm that the NFS mounts are working:
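Since both entries differ only in the export name, a small hypothetical helper can generate them from one variable (the address `198.51.100.20` is a documentation-only example, and `fstab.additions` is an illustrative scratch file, not a real path used by Instance Manager):

```shell
# Hypothetical helper: generate both fstab entries from one variable.
# 198.51.100.20 is a documentation address (TEST-NET-2); use your real
# NFS server's IP instead.
NFS_SERVER_IP="198.51.100.20"

for export_name in clickhouse nms; do
    printf '%s:/mnt/nfs_share/%s /var/lib/%s nfs defaults 0 0\n' \
        "$NFS_SERVER_IP" "$export_name" "$export_name"
done > fstab.additions

# Review before appending to /etc/fstab
cat fstab.additions
```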
```sh
sudo mount -a
df -h
ls -lart /mnt/nfs_share/clickhouse
ls -lart /var/lib/nms
sudo ls -lart /var/lib/clickhouse
telnet <NFS_SERVER_IP> 2049
rpcinfo -p <NFS_SERVER_IP>
sudo showmount -e <NFS_SERVER_IP>
dmesg | grep nfs
```

To test failover, simulate a failure on the active server:
- Restart `keepalived`:

  ```sh
  sudo systemctl restart keepalived
  ```

- Stop a monitored service:

  ```sh
  sudo systemctl stop clickhouse-server
  ```

- Reboot the active server:

  ```sh
  sudo reboot
  ```

- Simulate a network failure by disconnecting the active server.
To check if the passive server has taken over, run the following command on the backup server:
```sh
ip a | grep <VIRTUAL_IP_ADDRESS>
```

The VIP should now be assigned to the secondary server.
If failover does not work as expected, check the following:
- Ensure `keepalived` is running:

  ```sh
  systemctl status keepalived
  ```

- Check logs for errors:

  ```sh
  journalctl -u keepalived --no-pager | tail -50
  ```

- Verify that NFS mount points are accessible:

  ```sh
  df -h
  ```

- Check the `keepalived` configuration for syntax errors:

  ```sh
  cat /etc/keepalived/keepalived.conf
  ```
For additional support, visit the F5 Support Portal.