Monitoring and Tracing

Learn about the monitoring and tracing features available in NGINX Service Mesh.

Overview

NGINX Service Mesh can integrate with your Prometheus, Grafana, and tracing backends for exporting metrics and tracing data.

See Also:
If you do not have your own Prometheus, Grafana, or tracing backends deployed, the Observability Tutorial will deploy and integrate a basic demo setup for you.

Prometheus

Important:

To prevent automatic sidecar injection into your Prometheus deployment, deploy it in a namespace where auto-injection is disabled, or add the injector.nsm.nginx.com/auto-inject: "false" annotation to the PodTemplateSpec of the Prometheus deployment.

Refer to NGINX Service Mesh Annotations for more information.
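
As a rough sketch of the annotation approach, the following writes a patch file that sets the annotation on the pod template of an existing deployment. The file name no-inject-patch.yaml, the deployment name prometheus, and the namespace monitoring are placeholders chosen for illustration, not names defined by NGINX Service Mesh.

```shell
# Write a patch that adds the auto-inject annotation to the pod template
# (spec.template.metadata.annotations), not to the Deployment's own metadata.
cat > no-inject-patch.yaml <<'EOF'
spec:
  template:
    metadata:
      annotations:
        injector.nsm.nginx.com/auto-inject: "false"
EOF

# Apply it to an existing deployment (placeholder name and namespace):
#   kubectl patch deployment prometheus --namespace monitoring --patch-file no-inject-patch.yaml
```

Changing the pod template should trigger a rollout, so the replacement pods are created with the annotation already in place. The same patch file can be reused for Grafana or tracing deployments.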

Warning:
We do not currently support Prometheus deployments running with TLS encryption.

To use NGINX Service Mesh with your Prometheus deployment:

  1. Connect your existing Prometheus deployment to NGINX Service Mesh:

    The expected address format is <service-name>.<namespace>:<port>.

    • At deployment:

      Run the nginx-meshctl deploy command with the --prometheus-address flag.

      For example:

      nginx-meshctl deploy ... --prometheus-address "my-prometheus.example-namespace:9090"
      
    • At runtime:

      You can use the NGINX Service Mesh API to update the Prometheus address that the control plane uses to get metrics.

      For example, send the following payload via PATCH to the NGINX Service Mesh API:

      {
         "op": "replace",
         "field": {
            "prometheusAddress": "my-prometheus.example-namespace:9090"
         }
      }
      
  2. Add the nginx-mesh-sidecars scrape config to your Prometheus configuration.

    Note:
    If you are deploying NGINX Plus Ingress Controller with NGINX Service Mesh, add the nginx-plus-ingress scrape config as well. Consult the Metrics section of the NGINX Ingress Controller Deployment tutorial for more information about the metrics collected.

See Also:
For more information on how to view and understand the metrics that we track, see our Prometheus Metrics guide.
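
The address format used in step 1 can be checked before it is passed to nginx-meshctl. The sketch below is only a shape check under stated assumptions: the helper name is_mesh_address and the PROM_* variables are hypothetical, and the pattern approximates Kubernetes naming rules rather than reproducing whatever validation the mesh itself performs.

```shell
# Return success if the argument looks like <service-name>.<namespace>:<port>.
# This checks the shape only; it does not verify that the service exists.
is_mesh_address() {
  printf '%s\n' "$1" |
    grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?\.[a-z0-9]([-a-z0-9]*[a-z0-9])?:[0-9]{1,5}$'
}

PROM_SERVICE="my-prometheus"        # placeholder service name
PROM_NAMESPACE="example-namespace"  # placeholder namespace
PROM_PORT="9090"

PROM_ADDRESS="${PROM_SERVICE}.${PROM_NAMESPACE}:${PROM_PORT}"

if is_mesh_address "$PROM_ADDRESS"; then
  # Print the flag to add to the nginx-meshctl deploy command.
  echo "--prometheus-address \"$PROM_ADDRESS\""
else
  echo "unexpected address format: $PROM_ADDRESS" >&2
fi
```

With the placeholder values above, the script prints the same flag and address used in the deployment example in step 1.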

Grafana

You can import the custom NGINX Service Mesh Grafana dashboard, NGINX Mesh Top, into your Grafana instance. For instructions and a list of features, see the Grafana example in the nginx-service-mesh GitHub repo.

Important:

To prevent automatic sidecar injection into your Grafana deployment, deploy it in a namespace where auto-injection is disabled, or add the injector.nsm.nginx.com/auto-inject: "false" annotation to the PodTemplateSpec of the Grafana deployment.

Refer to NGINX Service Mesh Annotations for more information.

Tracing

NGINX Service Mesh can export tracing data using either OpenTelemetry or OpenTracing.

OpenTelemetry

With OpenTelemetry, NGINX Service Mesh exports tracing data to an OpenTelemetry Protocol (OTLP) gRPC Exporter, such as the OpenTelemetry (OTEL) Collector, which can then export the data to one or more upstream collectors such as Jaeger, DataDog, or LightStep. Before installing the mesh, deploy an OTEL Collector configured to export data to an upstream collector that you have already deployed or have access to.

Tracing relies on the trace headers passed through each microservice in an application in order to build a full trace of a request. If you don’t configure your app to pass trace headers, you’ll get disjointed traces that are more difficult to understand. See the OpenTelemetry Instrumentation guide for information on how to instrument your application to pass trace headers.
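
To illustrate what passing trace headers means in practice, here is a minimal sketch that forwards an incoming trace header on an outbound call. It assumes W3C Trace Context propagation (the traceparent header), which is the OpenTelemetry default; your setup may use different headers, and a real application would normally rely on an OpenTelemetry SDK instead. The function name forward_trace_header is hypothetical.

```shell
# Print a "traceparent" header line for an outbound request, given the value
# of the traceparent header that the service received.
# W3C Trace Context shape: <version>-<trace-id>-<parent-id>-<flags>,
# written as 2, 32, 16, and 2 lowercase hex digits respectively.
forward_trace_header() {
  tp="$1"
  if printf '%s\n' "$tp" |
    grep -Eq '^[0-9a-f]{2}-[0-9a-f]{32}-[0-9a-f]{16}-[0-9a-f]{2}$'; then
    printf 'traceparent: %s\n' "$tp"
  else
    # Malformed or missing value: emit nothing and signal failure.
    return 1
  fi
}
```

A service could then attach the header to the next hop, for example with curl --header "$(forward_trace_header "$INCOMING_TRACEPARENT")" followed by the downstream URL, where INCOMING_TRACEPARENT is a placeholder for the value read from the incoming request.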

Important:

To prevent automatic sidecar injection into your tracing deployments, deploy them in a namespace where auto-injection is disabled, or add the injector.nsm.nginx.com/auto-inject: "false" annotation to the PodTemplateSpec of each deployment.

Refer to NGINX Service Mesh Annotations for more information.

Note:
OpenTracing and OpenTelemetry cannot be enabled at the same time.

To configure OpenTelemetry tracing:

  • At deployment:

    Use the --telemetry-exporters flag to point the mesh to your OTLP exporter:

    nginx-meshctl deploy ... --telemetry-exporters "type=otlp,host=otel-collector.example-namespace.svc,port=4317"
    

    To set the sampler ratio used for tracing, add the --telemetry-sampler-ratio flag to your deploy command. The sampler ratio must be a float between 0 and 1. It sets the probability that a span is sampled: a sampler ratio of 0.1 gives each span a 10% probability of being sampled. For example:

    nginx-meshctl deploy ... --telemetry-sampler-ratio 0.1
    
  • At runtime:

    You can use the NGINX Service Mesh API to update the telemetry configuration.

    For example, send the following payload via PATCH to the NGINX Service Mesh API:

    {
       "op": "replace",
       "field": {
          "telemetry": {
             "exporters": {
                "otlp": {
                   "host": "otel-collector.example-namespace.svc",
                   "port": 4317
                }
             },
             "samplerRatio": 0.1
          }
       }
    }
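
The runtime payload above can also be generated from variables, with the sampler ratio checked before anything is sent. This is a sketch under stated assumptions: the function name telemetry_payload is hypothetical, the port is not validated, and how the payload reaches the NGINX Service Mesh API depends on how you expose that API.

```shell
# Print a telemetry PATCH payload, or fail if the sampler ratio is out of range.
# Arguments: <otlp-host> <otlp-port> <sampler-ratio>
telemetry_payload() {
  host="$1"; port="$2"; ratio="$3"

  # Accept only plain decimals from 0 to 1 inclusive (for example 0, 0.1, 1.0).
  # A leading digit is required so the value is also a valid JSON number.
  printf '%s\n' "$ratio" | grep -Eq '^(0(\.[0-9]+)?|1(\.0+)?)$' || {
    echo "sampler ratio must be between 0 and 1: $ratio" >&2
    return 1
  }

  cat <<EOF
{
   "op": "replace",
   "field": {
      "telemetry": {
         "exporters": {
            "otlp": {
               "host": "$host",
               "port": $port
            }
         },
         "samplerRatio": $ratio
      }
   }
}
EOF
}

# Reproduces the example payload shown above.
telemetry_payload "otel-collector.example-namespace.svc" 4317 0.1
```

Calling the function with a ratio such as 1.5 prints an error to standard error and returns a non-zero status instead of producing a payload.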
    
    

If configured correctly, tracing data that is generated or propagated by the NGINX Service Mesh sidecar will be exported to the OTEL Collector, and then exported to the upstream collector(s), as shown in the following example diagram:

[Figure: OpenTelemetry Data Flow. Tracing data flow using the OpenTelemetry Collector.]

OpenTracing

Note:
OpenTracing is deprecated in favor of OpenTelemetry.

NGINX Service Mesh supports OpenTracing with Zipkin, Jaeger, and DataDog through the NGINX OpenTracing Module.

Tracing relies on the trace headers passed through each microservice in an application in order to build a full trace of a request. If you don’t configure your app to pass trace headers, you’ll get disjointed traces that are more difficult to understand. See the OpenTracing Language Support guide for information on how to instrument your application to pass trace headers in your language of choice.

Important:

To prevent automatic sidecar injection into your tracing deployment, deploy it in a namespace where auto-injection is disabled, or add the injector.nsm.nginx.com/auto-inject: "false" annotation to the PodTemplateSpec of the deployment.

Refer to NGINX Service Mesh Annotations for more information.

Important:
OpenTracing and OpenTelemetry cannot be enabled at the same time.

To configure OpenTracing:

  • At deployment:

    Add the --tracing-address and --tracing-backend flags to your nginx-meshctl deploy command.

    The expected address format is <service-name>.<namespace>:<port>.

    For example:

    nginx-meshctl deploy ... --tracing-backend "zipkin" --tracing-address "my-zipkin-server.example-namespace:9411"
    
    nginx-meshctl deploy ... --tracing-backend "jaeger" --tracing-address "my-jaeger-server.example-namespace:6831"
    

    To set the sample rate used for tracing, add the --sample-rate flag to your deploy command. The sample rate must be a float between 0 and 1. It sets the probability that a span is sampled: a sample rate of 0.1 gives each span a 10% probability of being sampled. For example:

    nginx-meshctl deploy ... --sample-rate 0.1
    
  • At runtime:

    You can use the NGINX Service Mesh API to update the tracing configuration.

    For example, send the following payload via PATCH to the NGINX Service Mesh API:

    {
       "op": "replace",
       "field": {
          "tracing": {
             "backend": "jaeger",
             "backendAddress": "my-jaeger-server.example-namespace:6831",
             "sampleRate": 0.1
          }
       }
    }
    
    

If you use DataDog, refer to the DataDog Agent documentation for deployment instructions. The DataDog Agent is required to connect mesh services to your DataDog server. Set --tracing-address to the address of your DataDog Agent and --tracing-backend to datadog.