Troubleshoot NGINX Controller and the Controller Agent
Steps to take to investigate and fix issues with NGINX Controller and the Controller Agent.
We recommend you upgrade NGINX Controller as new versions become available. Upgrades include new features, feature improvements, or fixes for known issues.
To look up your version of NGINX Controller:
- Open the NGINX Controller browser interface and log in.
- Select the NGINX Controller menu icon, then select Platform.
- On the Platform menu, select Cluster > Overview.
Refer to the NGINX Controller release notes to see what’s new in the latest release of NGINX Controller.
You can create a support package for NGINX Controller that you can use to diagnose issues.
You will need to provide a support package if you open a ticket with NGINX Support via the MyF5 Customer Portal.
/opt/nginx-controller/helper.sh supportpkg [-o|--output <file name>] [-s|--skip-db-dump] [-t|--timeseries-dump <hours>]
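| Option | Description |
|---|---|
| `-o`, `--output <file name>` | Save the support package file to `<file name>`. |
| `-s`, `--skip-db-dump` | Don’t include the database dump in the support package. |
| `-t`, `--timeseries-dump <hours>` | Include the last `<hours>` hours of timeseries data in the support package. |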
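As an illustration of how these options combine, the sketch below assembles a `supportpkg` command line from the flags shown in the usage line above and prints it without running it. The function name `supportpkg_cmd`, the output path, and the 6-hour window are hypothetical values chosen for this example.

```shell
#!/bin/sh
# Print (do not run) a supportpkg command line built from the documented options.
# Arguments: $1 = output file (optional)
#            $2 = "yes" to skip the database dump
#            $3 = hours of timeseries data to include (optional)
supportpkg_cmd() {
    cmd="/opt/nginx-controller/helper.sh supportpkg"
    if [ -n "${1:-}" ]; then
        cmd="$cmd --output $1"
    fi
    if [ "${2:-}" = "yes" ]; then
        cmd="$cmd --skip-db-dump"
    fi
    if [ -n "${3:-}" ]; then
        cmd="$cmd --timeseries-dump $3"
    fi
    printf '%s\n' "$cmd"
}

# Hypothetical values for illustration only.
supportpkg_cmd /var/tmp/case-pkg.tar.gz yes 6
```

Printing the command first makes it easy to review the options before running it on the NGINX Controller host.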
Take the following steps to create a support package:
Open a secure shell (SSH) connection to the NGINX Controller host and log in as an administrator.
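Run the `helper.sh` utility with the `supportpkg` option:
/opt/nginx-controller/helper.sh supportpkg
The support package is saved to:
/var/tmp/supportpkg-<timestamp>.tar.gz
To download the support package, run the following command on the machine that you want to copy it to: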
scp <username>@<controller-host-ip>:/var/tmp/supportpkg-<timestamp>.tar.gz /local/path
The support package is a tarball that includes NGINX Controller configuration information, logs, and system command output. Sensitive information, including certificate keys, is not included in the support package.
The support package gathers information from the following locations:
.
├── database
│   ├── common.dump - full dump of the common database
│   ├── common.dump_stderr - any errors when dumping the database
│   ├── common-apimgmt-api-client-api-keys.txt - contents of apimgmt_api_client_api_keys table from the common database
│   ├── common-apimgmt-api-client-groups.txt - contents of apimgmt_api_client_groups table from the common database
│   ├── common-email-verification.txt - contents of email_verification table from the common database
│   ├── common-oauth-clients.txt - contents of oauth_clients table from the common database
│   ├── common-settings-license.txt - contents of settings_license table from the common database
│   ├── common-settings-nginx-plus.txt - contents of settings_nginx_plus table from the common database
│   ├── common-table-size.txt - list of all tables and their size in the common database
│   ├── data-table-size.txt - list of all tables and their size in the data database
│   ├── postgres-database-size.txt - size of every database
│   ├── postgres-long-running-queries.txt - all queries running longer than 10 seconds
│   ├── system.dump - full dump of the system database
│   ├── system-account-limits.txt - contents of account_limits table from the system database
│   ├── system-accounts.txt - contents of accounts table from the system database
│   ├── system-deleted-accounts.txt - contents of deleted_accounts table from the system database
│   ├── system-deleted-users.txt - contents of deleted_users table from the system database
│   ├── system-users.txt - contents of users table from the system database
│   └── system-table-size.txt - list of all tables and their size in the system database
├── k8s - output of `kubectl cluster-info dump -o yaml` augmented with some extra info
│   ├── apiservices.txt - output of `kubectl get apiservice`
│   ├── kube-system - contents of the kube-system namespace
│   │   ├── coredns-5c98db65d4-6flb9
│   │   │   ├── desc.txt - pod description
│   │   │   ├── logs.txt - current logs
│   │   │   └── previous-logs.txt - previous logs, if any
│   │   ├── ...
│   │   ├── daemonsets.yaml - list of daemonsets
│   │   ├── deployments.yaml - list of deployments
│   │   ├── events.yaml - all events in this namespace
│   │   ├── namespace.yaml - details of the namespace, including finalizers
│   │   ├── pods.txt - output of `kubectl get pods --show-kind=true -o wide`
│   │   ├── pods.yaml - list of all pods
│   │   ├── replicasets.yaml - list of replicasets
│   │   ├── replication-controllers.yaml - list of replication controllers
│   │   ├── resources.txt - all Kubernetes resources in this namespace
│   │   └── services.yaml - list of services
│   ├── nginx-controller - contents of the nginx-controller namespace
│   │   ├── apigw-8fb64f768-9qwcm
│   │   │   ├── desc.txt - pod description
│   │   │   ├── logs.txt - current logs
│   │   │   └── previous-logs.txt - previous logs, if any
│   │   ├── ...
│   │   ├── daemonsets.yaml - list of daemonsets
│   │   ├── deployments.yaml - list of deployments
│   │   ├── events.yaml - all events in this namespace
│   │   ├── namespace.yaml - details of the namespace, including finalizers
│   │   ├── pods.txt - output of `kubectl get pods --show-kind=true -o wide`
│   │   ├── pods.yaml - list of all pods
│   │   ├── replicasets.yaml - list of replicasets
│   │   ├── replication-controllers.yaml - list of replication controllers
│   │   ├── resources.txt - all Kubernetes resources in this namespace
│   │   └── services.yaml - list of services
│   ├── nodes.txt - output of `kubectl describe nodes`
│   ├── nodes.yaml - list of nodes
│   ├── resources.txt - all non-namespaced Kubernetes resources (including PersistentVolumes)
│   └── version.yaml - Kubernetes version
├── logs - copy of /var/log/nginx-controller/
│   └── nginx-controller-install.log
├── os
│   ├── cpuinfo.txt - output of `cat /proc/cpuinfo`
│   ├── df-h.txt - output of `df -h`
│   ├── df-i.txt - output of `df -i`
│   ├── docker-container-ps.txt - output of `docker container ps`
│   ├── docker-images.txt - output of `docker images`
│   ├── docker-info.txt - output of `docker info`
│   ├── docker-stats.txt - output of `docker stats --all --no-stream`
│   ├── docker-version.txt - output of `docker version`
│   ├── du-mcs.txt - output of `du -mcs /opt/nginx-controller/* /var/log /var/lib`
│   ├── env.txt - output of `env`
│   ├── firewall-cmd.txt - output of `firewall-cmd --list-all`
│   ├── free.txt - output of `free -m`
│   ├── hostname-all-fqdns.txt - output of `hostname --all-fqdns`
│   ├── hostname-fqdn.txt - output of `hostname --fqdn`
│   ├── hostname.txt - output of `hostname`
│   ├── hostsfile.txt - output of `cat /etc/hosts`
│   ├── ip-address.txt - output of `ip address`
│   ├── ip-neigh.txt - output of `ip neigh`
│   ├── ip-route.txt - output of `ip route`
│   ├── iptables-filter.txt - output of `iptables -L -n -v`
│   ├── iptables-mangle.txt - output of `iptables -L -n -v -t mangle`
│   ├── iptables-nat.txt - output of `iptables -L -n -v -t nat`
│   ├── iptables-save.txt - output of `iptables-save`
│   ├── journal-kubelet.txt - output of `journalctl -q -u kubelet --no-pager`
│   ├── lspci.txt - output of `lspci -vvv`
│   ├── netstat-nr.txt - output of `netstat -nr`
│   ├── ps-faux.txt - output of `ps faux`
│   ├── pstree.txt - output of `pstree`
│   ├── ps.txt - output of `ps aux --sort=-%mem`
│   ├── resolvconf.txt - output of `cat /etc/resolv.conf`
│   ├── selinux-mode.txt - output of `getenforce`
│   ├── ss-ltunp.txt - output of `ss -ltunp`
│   ├── swapon.txt - output of `swapon -s`
│   ├── sysctl.txt - output of `sysctl -a --ignore`
│   ├── systemd.txt - output of `journalctl -q --utc`
│   ├── top.txt - output of `top -b -o +%CPU -n 3 -d 1 -w512 -c`
│   ├── uname.txt - output of `uname -a`
│   ├── uptime.txt - output of `cat /proc/uptime`
│   └── vmstat.txt - output of `cat /proc/vmstat`
├── timeseries
│   ├── table-sizes.stat - stat table containing controller table sizes
│   ├── events.csv - events table dump in csv
│   ├── events.sql - events table schema
│   ├── metrics_1day.csv - metrics_1day table dump in csv
│   ├── metrics_1day.sql - metrics_1day table schema
│   ├── metrics_1hour.csv - metrics_1hour table dump in csv
│   ├── metrics_1hour.sql - metrics_1hour table schema
│   ├── metrics_5min.csv - metrics_5min table dump in csv
│   ├── metrics_5min.sql - metrics_5min table schema
│   ├── metrics.csv - metrics table dump in csv
│   ├── metrics.sql - metrics table schema
│   ├── system-asynchronous-metrics.stat - shows info about currently executing events or consuming resources
│   ├── system-events.stat - information about the number of events that have occurred in the system
│   ├── system-metrics.stat - system metrics
│   ├── system-parts.stat - information about parts of a table in the MergeTree family
│   ├── system-settings.stat - information about settings that are currently in use
│   └── system-tables.stat - information about all the tables
└── version.txt - Controller version information
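Before you attach a support package to a ticket, it can help to confirm what it contains. The sketch below lists the entries of a package with `tar` and reports whether a given name appears among them. The function name `pkg_has_entry` is hypothetical, the `<timestamp>` placeholder must be replaced with the value from your system, and the internal layout may differ between NGINX Controller versions.

```shell
#!/bin/sh
# Succeed if the gzip-compressed tarball $1 has an entry whose path contains the string $2.
pkg_has_entry() {
    tar -tzf "$1" | grep -qF -- "$2"
}

# Placeholder path: replace <timestamp> with the real value from your system.
pkg="/var/tmp/supportpkg-<timestamp>.tar.gz"

if [ -f "$pkg" ]; then
    # Show the first entries of the package.
    tar -tzf "$pkg" | head -n 20
    if pkg_has_entry "$pkg" "version.txt"; then
        echo "version.txt is present"
    else
        echo "version.txt was not found"
    fi
fi
```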
If NGINX Controller isn’t logging WAF Violation Security Events for an App Component that has WAF enabled, take the following steps:
- Check the `agent.conf` security setting for every Instance referenced by the Gateway(s) associated with the App Component. Verify that the Extensions group contains the setting `security = True`.
- Restart the NGINX Controller Agent.
To start, stop, and restart the NGINX Controller Agent, run the following commands on the NGINX Plus system where you installed the Agent.
Start the NGINX Controller Agent:
service controller-agent start
Stop the NGINX Controller Agent:
service controller-agent stop
Restart the NGINX Controller Agent:
service controller-agent restart
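The `agent.conf` check from the first step above can be scripted. The following is a minimal sketch that assumes `agent.conf` is an INI-style file in which the Extensions group appears as an `[extensions]` header, and that the file lives at `/etc/controller-agent/agent.conf`; both the layout and the path are assumptions, so adjust them to match your installation. The function name `security_enabled` and the `AGENT_CONF` variable are hypothetical.

```shell
#!/bin/sh
# Succeed if the [extensions] group of the INI-style file $1 contains "security = True".
security_enabled() {
    awk '
        # Track whether the current section header is [extensions] (case-insensitive).
        /^[ \t]*\[/ { in_ext = (tolower($0) ~ /^[ \t]*\[extensions\][ \t]*$/) }
        # Inside that section, look for the exact setting.
        in_ext && /^[ \t]*security[ \t]*=[ \t]*True[ \t]*$/ { found = 1 }
        END { exit !found }
    ' "$1"
}

# Assumed default location; override with AGENT_CONF if yours differs.
conf="${AGENT_CONF:-/etc/controller-agent/agent.conf}"

if [ -r "$conf" ]; then
    if security_enabled "$conf"; then
        echo "security = True is set in $conf"
    else
        echo "security = True was not found in the extensions group of $conf"
    fi
fi
```

Run the check on each Instance, and restart the Controller Agent after changing the setting.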
If you don’t see Signature Names in Security Violation Events, restart the Controller Agent on the dataplane instance.
sudo systemctl restart controller-agent
When deploying an NGINX Plus instance, the deployment may fail because the Controller Agent install script doesn’t download. When this happens, an error similar to the following is logged to `/var/log/agent_install.log`: “Failed to download the install script for the agent.”
Take the following steps to troubleshoot the issue:
- Ensure that ports 443 and 8443 are open between NGINX Controller and the network where the NGINX Plus instance is being deployed.
- Verify that you can communicate with NGINX Controller from the NGINX Plus instance using the NGINX Controller FQDN that you provided when you installed NGINX Controller.
- If you’re deploying an NGINX Plus instance on Amazon Web Services using a template, ensure that the Amazon Machine Image (AMI) referenced in the `instance_template` has a cURL version of 7.32 or newer.
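The checks in the list above can be sketched as a short script to run on the NGINX Plus instance. It compares the installed cURL version against 7.32 and sends a test request to ports 443 and 8443 on the NGINX Controller host. The function name `version_ge` and the `CONTROLLER_FQDN` variable are hypothetical. A failed request does not by itself prove that a port is blocked, although curl exit codes 7 (could not connect) and 28 (timed out) usually point to a connectivity problem.

```shell
#!/bin/sh
# Succeed if dotted version $1 is greater than or equal to dotted version $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        # Compare field by field; missing fields count as 0.
        for (i = 1; i <= (n > m ? n : m); i++) {
            if (x[i] + 0 > y[i] + 0) exit 0
            if (x[i] + 0 < y[i] + 0) exit 1
        }
        exit 0
    }'
}

# Check the installed cURL version, if curl is available.
if command -v curl >/dev/null 2>&1; then
    curl_version=$(curl --version | awk 'NR == 1 { print $2 }')
    if version_ge "$curl_version" 7.32; then
        echo "cURL $curl_version meets the 7.32 minimum"
    else
        echo "cURL $curl_version is older than 7.32"
    fi
fi

# Test both ports; set CONTROLLER_FQDN to the FQDN used when installing NGINX Controller.
if [ -n "${CONTROLLER_FQDN:-}" ]; then
    for port in 443 8443; do
        if curl -sk -o /dev/null --connect-timeout 5 "https://${CONTROLLER_FQDN}:${port}/"; then
            echo "port $port: request succeeded"
        else
            echo "port $port: request failed (curl exit code $?)"
        fi
    done
fi
```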
If you’re prompted for a password while installing the Controller Agent, you are probably running the install script from a non-root account. In that case you need sudo rights, and, depending on your system configuration, sudo may ask for your password.
After you install and start the Controller Agent, it should begin reporting right away, pushing aggregated data to NGINX Controller at regular one-minute intervals. It takes about one minute for a new Instance to appear in the NGINX Controller user interface.
If you don’t see the new Instance in the user interface or the Controller Agent isn’t collecting metrics, make sure of the following:
The Controller Agent package, `nginx-controller-agent`, installed successfully without any warnings.
The controller-agent service is running and updating its log file. To check the status, run the following command:
systemctl status controller-agent
See Also: For troubleshooting purposes, you can turn on Controller Agent debug logging by editing the `agent.conf` file. For more information, refer to K64001240: Enabling NGINX Controller Agent debug logging.
The system DNS resolver is correctly configured, and the NGINX Controller server’s fully qualified domain name (FQDN) can be resolved.
The controller-agent service runs as `root` or, if the Controller Agent was installed to run as a non-root user, as the user chosen during installation. To view the user ID for the controller-agent service, run the following command:
ps -ef | egrep 'agent'
The output looks similar to the following (with a different user for non-root Agent installations):
root 19132 1 1 Sep03 ? 00:23:45 /usr/bin/nginx-controller-agent
The Controller Agent and the NGINX Instance user IDs can both run the `ps` command to see all the system processes. If the `ps` command is restricted for non-privileged users, the Controller Agent won’t detect the NGINX master process.
The system time is set correctly. If the time on the system where the Controller Agent is running is ahead or behind the NGINX Controller’s system time, you won’t be able to see data in graphs. Make sure that NGINX Controller and any NGINX Instances have their time synchronized using NTP.
The NGINX Plus API is set up correctly and working.
Refer to the Configuring the API section of the NGINX Plus Admin Guide for instructions.
All NGINX configuration files are readable by the Controller Agent user ID. Verify the owner, group, and permissions settings.
SELinux, AppArmor, or other third-party OS security tools are not interfering with the metrics collection. For example, for SELinux, try running `setenforce 0` temporarily to see if it improves the situation for certain metrics.
The virtual private server (VPS) provider has not used hardened Linux kernels that may restrict non-root users from accessing `/sys`. Metrics describing the system and NGINX disk I/O are usually affected. There is no easy workaround for this except to allow the Controller Agent to run as `root`. Fixing permissions for `/sys/block` may also help.
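Several of the checks above can be run together. The following sketch reports the controller-agent service state, tests whether the NGINX Controller FQDN resolves, prints the NTP synchronization status, and lists NGINX configuration files that the current user can’t read. The function names, the `CONTROLLER_FQDN` and `NGINX_CONF_DIR` variables, and the `/etc/nginx` default are assumptions for illustration. `getent` and `timedatectl show` aren’t available on every system, so each step is skipped when its tool is missing. Run the script as the Controller Agent user so that the readability check reflects that user’s permissions.

```shell
#!/bin/sh
# Succeed if the host name $1 resolves through the system resolver.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Print every regular file under directory $1 that the current user cannot read.
# Files inside directories that cannot be traversed are not reported.
list_unreadable() {
    find "$1" -type f 2>/dev/null | while IFS= read -r f; do
        [ -r "$f" ] || printf '%s\n' "$f"
    done
}

# Hypothetical inputs: set these for your environment.
fqdn="${CONTROLLER_FQDN:-}"
nginx_conf_dir="${NGINX_CONF_DIR:-/etc/nginx}"

# Service state (prints "active" when the service is running).
if command -v systemctl >/dev/null 2>&1; then
    echo "controller-agent service: $(systemctl is-active controller-agent 2>/dev/null)"
fi

# DNS resolution of the NGINX Controller FQDN.
if [ -n "$fqdn" ] && command -v getent >/dev/null 2>&1; then
    if resolves "$fqdn"; then
        echo "DNS: $fqdn resolves"
    else
        echo "DNS: $fqdn does not resolve"
    fi
fi

# Time synchronization status as reported by systemd.
if command -v timedatectl >/dev/null 2>&1; then
    echo "NTP synchronized: $(timedatectl show -p NTPSynchronized --value 2>/dev/null)"
fi

# Configuration files the current user cannot read.
if [ -d "$nginx_conf_dir" ]; then
    echo "Unreadable files under $nginx_conf_dir (none listed means all are readable):"
    list_unreadable "$nginx_conf_dir"
fi
```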
For more information on installing and configuring the Controller Agent, see the following topics:
If NGINX Controller appears to be unlicensed after a version upgrade, try the following options to resolve the issue.
Certain content-filtering and ad-blocking web browser extensions may incorrectly block the elements on the NGINX Controller Analytics events page. As a result, when you access the Analytics > Events page using the NGINX Controller user interface, you may observe messages indicating missing events. Refer to the AskF5 KB article K48603454 to learn more about this issue and how to resolve it.