NGINX Documentation

Troubleshooting Guide

App Protect Troubleshooting Overview

If you encounter any problems, collect the troubleshooting information in a tarball and send it to your customer support engineer.

  1. Prepare the data to collect for troubleshooting:

    • Get all versions via: cat /opt/app_protect/VERSION > rpm_versions.txt && rpm -qa nginx-plus* app-protect* >> rpm_versions.txt
    • Get OS via: cat /etc/os-release > system_version.txt && uname -r >> system_version.txt && cat /proc/version >> system_version.txt

  2. Create a list of the files for the tarball in a file called logs.txt:

    • rpm_versions.txt
    • system_version.txt
    • /var/log/app_protect/* (all App Protect files)
    • /var/log/nginx/* (all NGINX files)

  3. Add all policies and log configuration files to logs.txt.

  4. Add all NGINX configuration files to logs.txt, such as /etc/nginx/nginx.conf and every file it references.

  5. Create the tarball:

    tar cvfz logs.tgz `cat logs.txt`
    
  6. Send logs.tgz to support.
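
The tarball step can also be scripted. The following is a minimal Python sketch, assuming logs.txt holds whitespace-separated paths or glob patterns as in step 2. The function name build_tarball is hypothetical and is not part of App Protect.

```python
import glob
import tarfile
from pathlib import Path


def build_tarball(list_file="logs.txt", out_path="logs.tgz"):
    """Pack every file named in list_file into a gzip-compressed tarball.

    Assumption: list_file contains whitespace-separated paths or glob
    patterns (for example /var/log/nginx/*).
    Returns the sorted list of paths that were added.
    """
    patterns = Path(list_file).read_text().split()
    # Expand the globs here, because no shell is involved.
    # Patterns that match nothing are silently skipped.
    paths = sorted({p for pattern in patterns for p in glob.glob(pattern)})
    with tarfile.open(out_path, "w:gz") as tar:
        for path in paths:
            tar.add(path)
    return paths
```

Calling build_tarball() from the directory that contains logs.txt writes logs.tgz next to it. Unlike the tar command above, entries that do not exist are skipped rather than reported as errors.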

App Protect Logging Overview

There are 3 types of logs that App Protect on NGINX generates:

  • Security log or Request log: The HTTP requests and how App Protect processed them, including violations and signatures found.
  • Operation log: Events such as startup, shutdown and reconfiguration.
  • Debug logs: technical messages at different levels of severity used to debug and resolve incidents and error behaviors.

Note that NGINX does not have audit logs in the sense of who did what. Such auditing can be done either from the orchestration system controlling NGINX (such as NGINX Controller) or by tracking the configuration files and the systemd invocations using Linux tools.

App Protect uses its own logging mechanism for request logging rather than NGINX’s access logging mechanism (which is NGINX’s default logging mechanism).

Type | Log Configuration | Configuration contexts | File Destination | Syslog Destination
Security | app_protect_security_log directive referencing security_log.json file | nginx.conf: http, server, location | Yes, either stderr or an absolute path to a local file is supported | Yes
Operation | error_log directive, part of core NGINX | nginx.conf - global | Yes, NGINX error log | Yes, NGINX error log
Debug | /etc/app_protect/bd/logger.cfg. Log file name is the redirection in the invocation of the bd command line in the start script | Global (not part of nginx.conf) | Yes. Log file is in /var/log/app_protect default debug directory. No file rotation currently | No

Operation Logs

Overview

The operation logs consist of system operational and health events. The events are sent to the NGINX error log and are distinguished by the APP_PROTECT prefix followed by a JSON body. The log level depends on the event: success is usually Notice while failure is Error. The timestamp is inherent in the error log.
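
Because each event is a JSON body after the APP_PROTECT prefix, it can be pulled out of an error-log line programmatically. The following is a minimal Python sketch. The function name parse_app_protect_event is hypothetical, and the timestamp and level layout of the sample line is an assumption about how the error log looks on a given system.

```python
import json

PREFIX = "APP_PROTECT"


def parse_app_protect_event(line):
    """Return the JSON body of an APP_PROTECT error-log line as a dict.

    Returns None when the line has no APP_PROTECT prefix or the text
    after the prefix is not valid JSON.
    """
    _, found, rest = line.partition(PREFIX)
    start = rest.find("{")
    if not found or start == -1:
        return None
    try:
        # raw_decode stops at the end of the JSON object and ignores
        # anything that follows it on the line.
        event, _ = json.JSONDecoder().raw_decode(rest[start:])
    except json.JSONDecodeError:
        return None
    return event


# Hypothetical sample line; only the APP_PROTECT prefix and the JSON
# body are taken from this guide.
sample = ('2020/02/20 13:35:40 [notice] 4928#4928: APP_PROTECT '
          '{ "event": "waf_connected", "bd_thread_id": 3, "worker_pid": 4928, '
          '"mode": "operational", "mode_changed": true }')
event = parse_app_protect_event(sample)
print(event["event"], event["mode"])  # waf_connected operational
```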

Events

Event Type | Level | Meaning
App Protect Connected | Notice | A worker successfully connected to NGINX App Protect Enforcer. The mode attribute should be operational unless there is an ongoing problem.
{
    "event": "waf_connected",
    "bd_thread_id": 3,
    "worker_pid": 4928,
    "mode": "operational",
    "mode_changed": true
}
Event Type | Level | Meaning
App Protect Connection Failure | Error | A worker attempted to connect to NGINX App Protect but the operation failed. The mode should be failure.
{
    "event": "waf_connection_failure",
    "bd_thread_id": 3,
    "worker_pid": 4928,
    "mode": "failure",
    "mode_changed": true
}

Event Type | Level | Meaning
App Protect Disconnected | Error | Engine disconnected from Worker (socket closed). The mode should be failure.
{
    "event": "waf_disconnected",
    "bd_thread_id": 3,
    "worker_pid": 4928,
    "mode": "failure",
    "mode_changed": true
}
Event Type | Level | Meaning
App Protect Resource Exception | Warning | A resource, as measured by the Worker, exceeded its limit (above the high threshold). The mode should be failure. It may already have been in this mode because other resources had exceeded their limits.
{
    "event": "waf_resource_exception",
    "bd_thread_id": 3,
    "worker_pid": 4928,
    "mode": "failure",
    "mode_changed": true,
    "resource": "cpu",
    "value": 98,
    "threshold": 95
}
Event Type | Level | Meaning
App Protect Resource Reverted to Normal | Warning | A resource, as measured by the Worker, went back to the normal range (below the low threshold). The mode should be operational, unless other resources are still out of limits.
{
    "event": "waf_resource_revert",
    "bd_thread_id": 3,
    "worker_pid": 4928,
    "mode": "operational",
    "mode_changed": true,
    "resource": "cpu",
    "value": 88,
    "threshold": 90
}
Event Type | Level | Meaning
Configuration Error | Error | There were errors in the App Protect directives in the nginx.conf file. This event is issued only if the directive name was spelled correctly; otherwise NGINX core issues the error. This event occurs before configuration_load_start and means there will be no configuration load. It is generated only on configuration reload; it cannot be generated on the first configuration load because no error log is configured yet.
{
    "event": "configuration_error",
    "error_message": "unknown argument",
    "line_number": 58
}
Event Type | Level | Meaning
Configuration Load Start | Notice | The App Protect configuration load process started. The configuration consists of all the policies, security log configurations and global settings. These are all part of the config set file generated by the module and passed to the Policy Compiler. The path to this file is included in the event message. This event is generated only on configuration reload; it cannot be generated on the first configuration load because no error log is configured yet.
{
    "event": "configuration_load_start",
    "configSetFile": "/opt/app_protect/share/config_set.json"
}
Event Type | Level | Meaning
Configuration Load Failure | Error | There was an error in one of the configuration files: file not found, failed to compile, or the configuration failed to load to the engine.
{
    "event": "configuration_load_failure",
    "error_message": "Failed to import Policy '/etc/nginx/default_policy.json' from '/etc/nginx/default_policy.json': Fail parse JSON Policy: malformed JSON string, neither tag, array, object, number, string or atom, at character offset 0 (before \"xxxx\\nhdjk\\n\\n555\\n\") \n.\n",
    "error_line_number": 58
}
Event Type | Level | Meaning
Configuration Load Success | Notice | The WAF configuration process ended successfully: all policies, log configuration and global settings were loaded to NGINX App Protect and all traffic will be handled by this configuration. The error_message attribute contains any warnings. This event is also generated on the initial configuration load (when NGINX starts). The event also includes the attack signature package version, which reflects the date the package was released, and the exact revision time in datetime format (including the time of day), which is compatible with the revision datetime in the signature-requirements element of the WAF policy.
{
    "event": "configuration_load_success",
    "error_message": "csrfProtection/enabled cannot be set to true and set to false",
    "software_version": "1.3.42",
    "attack_signatures_package": {
        "version": "2020.02.20",
        "revision_datetime": "2020-02-20T13:35:38"
    }
}
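
Once the JSON bodies are parsed, the attributes shown in the events above (worker_pid, mode, error_message) are enough to build a quick health summary. The following is a rough Python sketch. The function name summarize_events and the shape of its return value are hypothetical and serve only as an illustration.

```python
def summarize_events(events):
    """Summarize already-parsed operation-log events.

    events: iterable of dicts shaped like the JSON bodies in this section.
    Returns (modes, load_failures): modes maps each worker_pid to the
    most recently reported mode; load_failures lists the error_message
    of every configuration_load_failure event, in order.
    """
    modes = {}
    load_failures = []
    for event in events:
        if "worker_pid" in event and "mode" in event:
            modes[event["worker_pid"]] = event["mode"]
        if event.get("event") == "configuration_load_failure":
            load_failures.append(event.get("error_message", ""))
    return modes, load_failures


# Events modeled on the examples in this section.
events = [
    {"event": "waf_connected", "worker_pid": 4928, "mode": "operational"},
    {"event": "waf_resource_exception", "worker_pid": 4928, "mode": "failure",
     "resource": "cpu", "value": 98, "threshold": 95},
    {"event": "waf_resource_revert", "worker_pid": 4928, "mode": "operational",
     "resource": "cpu", "value": 88, "threshold": 90},
]
modes, failures = summarize_events(events)
print(modes, failures)  # {4928: 'operational'} []
```

A worker whose last reported mode is failure, or a non-empty list of load failures, is a reasonable place to start an investigation.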

Access Logs

Access log is NGINX’s request log mechanism. It is controlled by two directives.

log_format

This directive determines the format of the log messages using predefined variables. App Protect enriches this set of variables with several security log attributes that can be included in the log_format. If log_format is not specified, the built-in combined format is used. Because that format does not include the extended App Protect variables, you must use this directive to add App Protect information to the log.

access_log

This directive determines the destination of the access log and the name of the format. The default is the file /etc/nginx/log/access.log using the combined format. To use a custom format that includes the App Protect variables, use this directive with the name of the desired format.

App Protect Variables for Access Log

These are the variables added to the access log. They are a subset of the Security log attributes; each variable name is the Security log attribute name prefixed with $app_protect_.

Name: $app_protect_support_id
Meaning: Unique ID assigned to the request by App Protect. Use it to correlate the access log with the security log.
Comment: Left empty in failure mode.

Name: $app_protect_outcome
Meaning: One of:
  • PASSED: request was sent to origin server.
  • REJECTED: request was blocked.

Name: $app_protect_outcome_reason
Meaning: One of:
  • SECURITY_WAF_OK: allowed with no violations (legal request).
  • SECURITY_WAF_VIOLATION: blocked due to security violations.
  • SECURITY_WAF_FLAGGED: allowed although it has violations (illegal).
  • SECURITY_WAF_BYPASS: WAF was supposed to inspect the request but it didn’t (because of unavailability or resource shortage). The request was PASSED or REJECTED according to the failure mode action determined by the user.
  • SECURITY_WAF_REQUEST_IN_FILE_BYPASS: WAF was supposed to inspect the request but it didn’t (because request buffer was full and request was written to file). The request was PASSED or REJECTED according to the failure mode action determined by the user.
  • SECURITY_WAF_COMPRESSED_REQUEST_BYPASS: WAF was supposed to inspect the request but it didn’t (because request was compressed). The request was PASSED or REJECTED according to the failure mode action determined by the user.

Name: $app_protect_policy_name
Meaning: The name of the policy that enforced the request.

Name: $app_protect_version
Meaning: The App Protect version string, in major.minor.build format.
Comment: Does not include the NGINX Plus version (e.g. R21). The latter is available in the $version variable.

Note that many of the other Security log attributes that are not included here have exact or similar parallels among the NGINX variables also available for access log. For example, $request is parallel to the request security log attribute. See the full list of NGINX variables.

Example

http {
    # Each fragment ends with ", " so adjacent fields stay separated
    # when NGINX concatenates the quoted strings.
    log_format security_waf 'request_time=$request_time client_ip=$remote_addr, '
                             'request="$request", status=$status, '
                             'waf_policy=$app_protect_policy_name, waf_request_id=$app_protect_support_id, '
                             'waf_action=$app_protect_outcome, waf_action_reason=$app_protect_outcome_reason';
 
    server {
 
        location / {
            access_log /etc/app_protect/logs/nginx-access.log security_waf;
            ...
        }
    }
}
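
A log line written in a key=value format like security_waf can be split back into fields for analysis. The following is a minimal Python sketch, assuming the fields are separated by commas or spaces and that only the request value is quoted. The function name parse_security_waf_line and all values in the sample line are hypothetical.

```python
import re

# A key, "=", then either a double-quoted value or a run of characters
# up to the next comma or whitespace.
PAIR = re.compile(r'(\w+)=("[^"]*"|[^,\s]*)')


def parse_security_waf_line(line):
    """Split a key=value access-log line into a dict of strings."""
    return {key: value.strip('"') for key, value in PAIR.findall(line)}


# Hypothetical sample line; the field names follow the example format.
line = ('request_time=0.005 client_ip=192.0.2.10, '
        'request="GET /index.html HTTP/1.1", status=403, '
        'waf_policy=example_policy, waf_request_id=1234567890, '
        'waf_action=REJECTED, waf_action_reason=SECURITY_WAF_VIOLATION')
fields = parse_security_waf_line(line)
print(fields["waf_action"], fields["waf_action_reason"])
# REJECTED SECURITY_WAF_VIOLATION
```

The waf_request_id value is the one to search for in the security log when a single request needs closer inspection.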

Debug Logs

Debug log settings determine the minimum log level and the internal App Protect components included in the log. We include a Perl script that allows modification of those parameters without having to restart App Protect or reload NGINX.

nginx.conf does not refer to the NGINX App Protect debug log configuration, either directly or indirectly.

Logger Configuration File

The logging configuration file is located at /etc/app_protect/bd/logger.cfg and lists the App Protect modules available for logging and debugging.

################################################################################################
#
#                                        Logger configuration file
#
#       Existing modules:
#
#       IO_PLUGIN (Requests & Responses) FTP_PLUGIN (ftp) SMTP_PLUGIN (smtp)
#       BEM (Accumulation Responses), ECARD (Tables),
#       ECARD_POLICY (Enforcer), BD_SSL (Communications), UMU (Memory),
#       IMF (Sockets), BD_MISC (Config and miscs),COOKIE_MGR (Cookies), REG_EXP (Regular expressions),
#       RESP_PARAMS (Extractions), ATTACK_SIG (Attack Signatures), BD_XML(XML Enforcer),
#       ATTACK_ENGINE (BF & BOT detect monitor), XML_PARSER (all xml engine), ACY (pattern match engine),
#       BD_PB (policy builder), BD_PB_SAMPLING (sampling decisions for pb), LEGAL_HASH (internal cache tables),
#       CLIENT_SIDE (Client Side infrastructure), STATS (policy builder statistics), ICAP (content inspection),
#       CLUSTER_ANOMALY (the anomaly distributed channel), PIPE (shmem channel bd-pbng, bd-lrn),
#       MPP_PARSER (Multipart parser), SA_PLUGIN (Session awareness), DATA_PROTECT (Data Protection Library),
#       GDM (Guardium DB security), ASM_IRULE (ASM iRule commands), LIBDATASYNC (Data Sync Library),
#       BD_CONF (BD MCP configuration), MPI_CHANNEL (BD initiated MPI events),
#       BD_FLUSH_TBLS(flush BD conf tables), CSRF (CSRF feature), BRUTE_FORCE_ENFORCER (Brute Force feature),
#       LONG_REQUEST (Long request), HTML_PARSER (HTML parser)
#
#       Log levels:
#
#       TS_DEBUG | TS_INFO | TS_NOTICE | TS_WARNING | TS_ERR | TS_CRIT
#
#       File numbers:
#
#       errors.log = 2 , debug.log = 3 (see /ts/agent/log.cfg)
#

Enabling Debug Logging

To add a module for logging:

#       MODULE = <module_name>;
#       LOG_LEVEL = <log level 1> | <log level 2> | ... | <log level n>;
#       FILE = <file number> (recommended 2 always);
#
#       Use # to comment out lines.

For example:

MODULE=IO_PLUGIN;
LOG_LEVEL=TS_INFO | TS_DEBUG;
FILE = 2;
MODULE = ALL;
LOG_LEVEL = TS_ERR | TS_CRIT | TS_WARNING | TS_NOTICE;
FILE=2;

Then run the Perl script in the App Protect bin folder to begin log collection:

/bin/su -s /bin/bash -c '/opt/app_protect/bin/set_active.pl -g' nginx
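
When several modules need to be enabled, the MODULE / LOG_LEVEL / FILE entries can be generated rather than typed by hand. The following is a small Python sketch that renders one entry in the syntax shown above and rejects log levels not listed in the logger configuration file. The function name logger_stanza is hypothetical; the sketch only produces text and does not write to logger.cfg or run the Perl script.

```python
# Log levels as listed in the logger configuration file header.
LOG_LEVELS = ("TS_DEBUG", "TS_INFO", "TS_NOTICE", "TS_WARNING", "TS_ERR", "TS_CRIT")


def logger_stanza(module, levels, file_number=2):
    """Render one MODULE / LOG_LEVEL / FILE entry for logger.cfg."""
    unknown = [level for level in levels if level not in LOG_LEVELS]
    if unknown:
        raise ValueError("unknown log level(s): " + ", ".join(unknown))
    return (
        f"MODULE = {module};\n"
        f"LOG_LEVEL = {' | '.join(levels)};\n"
        f"FILE = {file_number};\n"
    )


print(logger_stanza("IO_PLUGIN", ["TS_INFO", "TS_DEBUG"]), end="")
# MODULE = IO_PLUGIN;
# LOG_LEVEL = TS_INFO | TS_DEBUG;
# FILE = 2;
```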

ELK issues

ELK issues are addressed directly in GitHub: post the issue to the Kibana dashboards for F5 App Protect WAF GitHub repo.

SELinux

App Protect files and processes are labeled with the following two contexts:

  • nap-compiler_t
  • nap-engine_t

NGINX Plus is labeled with the httpd_t context.

If you run into a situation where SELinux denies access to something, start the troubleshooting by searching for audit denials related to one of the above contexts.

For example:

ausearch --start recent -m avc --raw -se nap-engine_t
Here, --start recent means that the search starts from 10 minutes ago.
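
The raw AVC records returned by ausearch can be reduced to the few fields that usually matter: the denied permissions, the source and target contexts, and the object class. The following is a minimal Python sketch. The function name parse_avc_denial is hypothetical, and the sample record is invented for illustration in the usual AVC layout; real records may carry additional fields.

```python
import re

AVC = re.compile(
    r'avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?'
    r'scontext=(?P<scontext>\S+)\s+tcontext=(?P<tcontext>\S+)\s+tclass=(?P<tclass>\S+)'
)


def parse_avc_denial(line):
    """Extract permissions and contexts from one raw AVC denial record.

    Returns None when the line is not an AVC denial.
    """
    match = AVC.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["perms"] = fields["perms"].split()
    return fields


# Hypothetical record; pid, comm, name and contexts are made up.
record = ('type=AVC msg=audit(1582205738.123:456): avc:  denied  { read } for  '
          'pid=4928 comm="example-proc" name="example.json" dev="dm-0" ino=1234 '
          'scontext=system_u:system_r:nap-engine_t:s0 '
          'tcontext=system_u:object_r:etc_t:s0 tclass=file permissive=0')
denial = parse_avc_denial(record)
print(denial["perms"], denial["scontext"], denial["tclass"])
# ['read'] system_u:system_r:nap-engine_t:s0 file
```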

For more information about how to use NGINX Plus with SELinux, see our blog.