Set Up Data Plane High Availability

Overview

This topic explains how to configure a high-availability data plane for your apps in on-premises deployments using NGINX Controller, NGINX Plus, and keepalived. High-availability data planes help to ensure your apps operate continuously without service interruptions.

Support for High Availability (HA) mode is limited to two NGINX Plus instances. You can set up data plane HA in environments where IP addresses can be controlled through standard operating system calls.

Implementation Considerations

  • Data plane HA does not support Zone Synchronization.
  • Data plane HA is not supported in public clouds such as Amazon Web Services, Microsoft Azure, or Google Cloud Platform.

Prepare the High-Availability Instances

Install NGINX Keepalived

  1. NGINX distributes keepalived and some utilities in the nginx-ha-keepalived package. Follow the instructions in Configuring High Availability in the NGINX Plus admin guide to install this version of NGINX keepalived.

    Caution:
    Other versions or packages of keepalived have not been tested and functionality is not guaranteed.
  2. To ensure that keepalived starts automatically when the instance restarts, you must enable the service:

    sudo systemctl enable keepalived.service
    

Configure the Data Plane Instances

After you’ve installed keepalived, complete the following manual steps on each data plane instance:

  1. Set up support for non-local IP address bindings:

    echo "net.ipv4.ip_nonlocal_bind=1" | sudo tee -a /etc/sysctl.conf
    sudo sysctl -p
    
  2. Disable the default web server by renaming the default.conf configuration file, then restart NGINX:

    sudo mv /etc/nginx/conf.d/default.conf /etc/nginx/conf.d/default.conf.bak
    sudo systemctl restart nginx.service
    
  3. Open the keepalived.conf file for editing and add the markers #NGINX_CONTROLLER_HA_BEGIN and #NGINX_CONTROLLER_HA_END to the virtual_ipaddress configuration section, then reload the configuration. The markers are required to avoid conflicts with the configuration that the NGINX Controller Agent applies to keepalived.

    Note:
    Any IP addresses that are manually added between these markers will be removed during a Gateway configuration operation.
    Note:
    Managing the keepalived.conf file through third-party configuration management software is not supported.

    The following is a sample Ubuntu configuration:

    global_defs {
            vrrp_version 3
    }
    
    vrrp_script chk_manual_failover {
            script "/usr/lib/keepalived/nginx-ha-manual-failover"
            interval 10
            weight 50
    }
    
    vrrp_script chk_nginx_service {
            script "/usr/lib/keepalived/nginx-ha-check"
            interval 3
            weight 50
    }
    
    vrrp_instance VI_1 {
            interface ens32
            priority 101
            virtual_router_id 51
            advert_int 1
            accept
            garp_master_refresh 5
            garp_master_refresh_repeat 1
            unicast_src_ip 192.168.100.100
            unicast_peer {
                    192.168.100.101
            }
            virtual_ipaddress {
    #NGINX_CONTROLLER_HA_BEGIN
    #NGINX_CONTROLLER_HA_END
            }
            track_script {
                    chk_nginx_service
                    chk_manual_failover
            }
            notify "/usr/lib/keepalived/nginx-ha-notify"
    }
    
  4. (Optional) To test that the keepalived service is up:

    1. Add an IP address between the #NGINX_CONTROLLER_HA_BEGIN and #NGINX_CONTROLLER_HA_END markers, then start keepalived:

      sudo systemctl start keepalived
      
    2. Check that the IP address is configured on the primary node that you designated when setting up keepalived.

    3. On the backup node, check that the keepalived service is running and that the test IP is not assigned:

      sudo systemctl status keepalived
      ip addr show dev <device in keepalived>
      
    4. After you’ve verified that keepalived works as expected, stop the service. The NGINX Controller Agent will modify the virtual_ipaddress contents and start the service when a Gateway configuration is pushed.

      sudo systemctl stop keepalived
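
The preparation steps above can be spot-checked with a small script before any Gateway configuration is pushed. The following is a minimal sketch, not part of NGINX Controller: the function names are hypothetical, and the default path /etc/keepalived/keepalived.conf is an assumption based on the usual keepalived layout. It checks that non-local binds are enabled and that both markers appear, in order, inside a virtual_ipaddress block.

```shell
#!/bin/sh
# Illustrative pre-flight checks for a data plane instance.
# Function names and default paths are assumptions for this sketch.

# Succeeds if net.ipv4.ip_nonlocal_bind is 1. The optional argument
# overrides the procfs path so the check can be tried on a test file.
check_nonlocal_bind() {
    proc_file="${1:-/proc/sys/net/ipv4/ip_nonlocal_bind}"
    [ "$(cat "$proc_file" 2>/dev/null)" = "1" ]
}

# Succeeds if #NGINX_CONTROLLER_HA_BEGIN is followed by
# #NGINX_CONTROLLER_HA_END inside a virtual_ipaddress { ... } block.
check_ha_markers() {
    conf="${1:-/etc/keepalived/keepalived.conf}"
    awk '
        /virtual_ipaddress[[:space:]]*[{]/ { in_vip = 1; next }
        in_vip && /#NGINX_CONTROLLER_HA_BEGIN/ { seen_begin = 1; next }
        in_vip && seen_begin && /#NGINX_CONTROLLER_HA_END/ { found = 1; next }
        in_vip && /[}]/ { in_vip = 0; seen_begin = 0 }
        END { exit(found ? 0 : 1) }
    ' "$conf"
}
```

For example, running `check_nonlocal_bind && check_ha_markers && echo "instance looks ready"` on each instance prints the message only when both checks pass. The script reads files only and does not change any configuration.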
      

Create a High-Availability Gateway

Follow the instructions to Create a Gateway.

  1. Open the NGINX Controller user interface and log in.

  2. Select the NGINX Controller menu icon, then select Services > Gateways.

  3. Select Create Gateway.

  4. Complete each of the configuration sections.

  5. When ready, review the API Spec and then select Submit to create the Gateway.

In particular, on the Gateways > Create Gateways > Placements page, take the steps below:

  1. In the Instance Refs box, select the NGINX instance(s) that you want to deploy the Gateway on.

  2. In the Listen IPs box, add the IP address(es) on which the server listens for and accepts requests.

    • To use non-local Listen IPs, you must enable net.ipv4.ip_nonlocal_bind on the instance.
    • When High Availability Mode is enabled, Virtual Router Redundancy Protocol (VRRP) is configured for the Listen IP address(es).
  3. To enable high-availability mode for your data paths, select Use High Availability Mode.
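
Because VRRP is configured for the Listen IP addresses in High Availability Mode, one way to confirm the result of a Gateway push is to check whether a Listen IP is currently assigned on a node. The sketch below is an illustration under assumptions: the function name has_address is hypothetical, and the interface and address in the usage example are placeholders. It parses the one-line-per-address output of `ip -o addr show` from standard input.

```shell
#!/bin/sh
# Illustrative helper: reads `ip -o addr show` output on stdin and
# succeeds if the given address (without prefix length) is present.
# The function name is an assumption for this sketch.
has_address() {
    want="$1"
    awk -v want="$want" '
        # Field 3 is the address family, field 4 is ADDRESS/PREFIX.
        ($3 == "inet" || $3 == "inet6") {
            split($4, cidr, "/")
            if (cidr[1] == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

A possible use, with a placeholder interface and address: `ip -o addr show dev ens32 | has_address 192.168.100.150 && echo "Listen IP is assigned here"`. On a healthy pair, only the node currently in the MASTER state would be expected to report the address.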

Performing Maintenance on High-Availability Pairs

Caution:
Configuration pushes made during the maintenance window will fail.

To perform maintenance updates on the high-availability pair, take the following steps:

  1. Determine which instance is the primary and which is the backup node. A node’s current state is recorded in the local /var/run/nginx-ha-keepalived.state file. You can use the cat command to display it:

    node-1 $ cat /var/run/nginx-ha-keepalived.state
    STATE=MASTER
    node-2 $ cat /var/run/nginx-ha-keepalived.state
    STATE=BACKUP
    

    In the example output, node-2 is the backup node.

  2. Stop keepalived on the backup node:

    sudo systemctl stop keepalived
    
  3. Perform any maintenance or updates to the backup node.

  4. Bring the backup node back online and ensure that keepalived is running:

    sudo systemctl start keepalived
    sudo systemctl status keepalived
    
  5. Stop keepalived on the primary node:

    sudo systemctl stop keepalived
    
  6. Test that the application still functions properly.

    If you notice any problems, re-enable the primary node and check that NGINX Plus and keepalived are running on the backup node:

    sudo systemctl status nginx
    sudo systemctl status keepalived
    
  7. Perform any maintenance or updates on the primary node.

  8. Bring the primary node back online.

  9. Check the state of keepalived on the primary node:

    journalctl -u keepalived.service -f
    ip addr show
    

    The output should show the state transition:

    Nov 23 12:44:52 testenv-d206eaca-data-1 Keepalived_vrrp[798]: (VI_1) Entering MASTER STATE
    Nov 23 12:44:52 testenv-d206eaca-data-1 Keepalived_vrrp[798]: (VI_1) using locally configured advertisement interval (1000 milli-sec)
    Nov 23 12:44:52 testenv-d206eaca-data-1 nginx-ha-keepalived[1338]: Transition to state 'MASTER' on VRRP instance 'VI_1'.
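
The maintenance procedure above depends on knowing each node's role before stopping keepalived. The following minimal sketch wraps the state file lookup in two helper functions. The function names are hypothetical; the default path is the state file named in step 1.

```shell
#!/bin/sh
# Illustrative helpers for reading the keepalived state file.
# Function names are assumptions for this sketch.

# Prints the recorded state (for example MASTER or BACKUP).
ha_state() {
    state_file="${1:-/var/run/nginx-ha-keepalived.state}"
    sed -n 's/^STATE=//p' "$state_file"
}

# Succeeds only if this node is currently the backup, which is the
# node the procedure services first.
is_backup_node() {
    [ "$(ha_state "$1")" = "BACKUP" ]
}
```

For example, `is_backup_node && sudo systemctl stop keepalived` stops the service only when the local node reports STATE=BACKUP, which guards against starting maintenance on the wrong node.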
    

This documentation applies to the following versions of NGINX Controller: 3.12, 3.13, 3.14, 3.15, and 3.16.