Monitoring and Tracing

Overview

When you deploy NGINX Service Mesh, it creates a Prometheus server, Zipkin server, and Grafana server for monitoring, tracing, and visualization, respectively.
By default, NGINX Service Mesh deploys with tracing enabled for all services.

The default addresses used for these resources are:

  • Prometheus: prometheus.nginx-mesh.svc.cluster.local:9090
  • Zipkin: zipkin.nginx-mesh.svc.cluster.local:9411
  • Grafana: grafana.nginx-mesh.svc.cluster.local:3000

If using Jaeger instead of Zipkin, the default address is jaeger.nginx-mesh.svc.cluster.local:6831. The Jaeger UI is available on port 16686.
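To reach these default services from a workstation, one option is to port-forward them with kubectl. The sketch below assumes the Service names match the hostnames listed above (prometheus, zipkin, grafana, jaeger), that the control plane runs in the nginx-mesh namespace, and that the jaeger Service exposes the UI port. The helper names ui_port and forward_mesh_ui are hypothetical and are not part of nginx-meshctl.

```shell
# Sketch: port-forward one of the default monitoring Services to localhost.
# Assumptions: Service names mirror the default addresses, and the
# control plane namespace is nginx-mesh.
MESH_NAMESPACE="nginx-mesh"

# Map a Service name to the port a browser would use.
ui_port() {
  case "$1" in
    prometheus) echo 9090 ;;
    zipkin)     echo 9411 ;;
    grafana)    echo 3000 ;;
    jaeger)     echo 16686 ;;  # Jaeger UI port, not the 6831 tracing address port
    *)          return 1 ;;
  esac
}

# forward_mesh_ui is a hypothetical helper name.
forward_mesh_ui() {
  fwd_port="$(ui_port "$1")" || { echo "unknown service: $1" >&2; return 1; }
  kubectl -n "$MESH_NAMESPACE" port-forward "svc/$1" "$fwd_port"
}
```

For example, running forward_mesh_ui grafana and then opening http://localhost:3000 should show the Grafana instance, provided the assumptions above hold for your deployment.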

If you already run one or more of these services, you can deploy NGINX Service Mesh with an existing Prometheus Deployment, an existing tracing Deployment, or an existing Grafana Deployment. See the OpenTracing section for the list of supported tracing backends.

Use an Existing Prometheus Deployment

To use NGINX Service Mesh with an existing Prometheus Deployment:

Warning:
We do not currently support Prometheus deployments running with TLS encryption.
  1. Connect your existing Prometheus Deployment to the NGINX Service Mesh

    The expected address format is <service-name>.<namespace>:<port>

    • At deployment:

      Run the nginx-meshctl deploy command with the --prometheus-address flag.

      For example:

      nginx-meshctl deploy ...  --prometheus-address "my-prometheus.example-namespace:9090"
      
    • At runtime:

      You can use the NGINX Service Mesh API to update the Prometheus address that the control plane uses to get metrics.

      For example:

      1. Run the kubectl port-forward command to expose the Service.

        kubectl port-forward -n nginx-mesh svc/nginx-mesh-api 8443:443
        
      2. Patch the existing Prometheus address:

        Send the following payload via PATCH to https://localhost:8443/api/config

        {
           "op": "replace",
           "field": {
              "prometheusAddress":"my-prometheus.example-namespace:9090"
           }
        }
        
      Note:
      When you update the Prometheus address at runtime, NGINX Service Mesh does not remove the default Prometheus Deployment. If you do not plan to use the default Prometheus Deployment, we recommend that you delete it to avoid unnecessary resource usage. It is located in the namespace of the NGINX Service Mesh control plane.
  2. Add the nginx-mesh-sidecars scrape config to your Prometheus configuration.

    Note:
    If you are deploying NGINX Plus Ingress Controller with the NGINX Service Mesh, add the nginx-plus-ingress scrape config as well. Consult the Metrics section of the NGINX Ingress Controller Deployment tutorial for more information about the metrics collected.
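The runtime update in step 1 can be sketched with curl. This is a minimal example under several assumptions: the port-forward from step 1 is still running, the API accepts a JSON body with a Content-Type header, and -k (skip certificate verification) is acceptable for a local test. Any authentication the API may require is not shown, and patch_prometheus_address is a hypothetical helper name.

```shell
# Sketch: send the PATCH payload from step 1 with curl.
# Assumptions: port-forward to svc/nginx-mesh-api is active on 8443;
# -k skips TLS verification for local testing only.
PROMETHEUS_ADDRESS="my-prometheus.example-namespace:9090"
API_URL="https://localhost:8443/api/config"

# Build the JSON body from the address.
PAYLOAD=$(printf '{"op":"replace","field":{"prometheusAddress":"%s"}}' "$PROMETHEUS_ADDRESS")

# patch_prometheus_address is a hypothetical helper name.
patch_prometheus_address() {
  curl -k -X PATCH "$API_URL" \
    -H "Content-Type: application/json" \
    -d "$PAYLOAD"
}
```

Call patch_prometheus_address once the port-forward is up. Keeping the address in a variable makes it easy to reuse the same value for the --prometheus-address flag.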

Use an Existing Grafana Deployment

If you prefer to use your own Grafana instance, you can deploy NGINX Service Mesh without Grafana by setting the --deploy-grafana flag to false:

nginx-meshctl deploy ... --deploy-grafana=false

The custom NGINX Service Mesh Grafana dashboard NGINX Mesh Top can be imported into your Grafana instance. For instructions and a list of features, see the Grafana example in the nginx-service-mesh GitHub repo.

OpenTracing

NGINX Service Mesh supports OpenTracing. Zipkin, Jaeger, and DataDog are supported through the NGINX OpenTracing Module.

Use an Existing Tracing Deployment

When you run the nginx-meshctl deploy command, you can specify the address of your existing tracing server.

To deploy NGINX Service Mesh with an existing tracing service:

  1. Add the --tracing-address flag to your deploy command.

    The expected address format is <service-name>.<namespace>:<port>.

    For example:

    nginx-meshctl deploy ... --tracing-backend "zipkin" --tracing-address "my-zipkin-server.example-namespace:9411"
    
    nginx-meshctl deploy ... --tracing-backend "jaeger" --tracing-address "my-jaeger-server.example-namespace:6831"
    

If you use DataDog, see the DataDog documentation for how to deploy the DataDog Agent, which is required to connect mesh services to your DataDog server. Set --tracing-address to the address of your DataDog Agent and --tracing-backend to datadog.
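A DataDog deployment might look like the sketch below. The agent Service name and namespace are placeholders, port 8126 is the DataDog Agent's default trace port and may differ in your setup, and deploy_with_datadog is a hypothetical wrapper that forwards whatever other deploy flags you normally pass.

```shell
# Sketch: deploy the mesh with DataDog as the tracing backend.
# Assumptions: the agent address below is a placeholder; 8126 is the
# DataDog Agent's default trace port.
DATADOG_AGENT_ADDRESS="my-datadog-agent.example-namespace:8126"

# deploy_with_datadog is a hypothetical helper name.
# "$@" carries any other flags you pass to nginx-meshctl deploy.
deploy_with_datadog() {
  nginx-meshctl deploy "$@" \
    --tracing-backend "datadog" \
    --tracing-address "$DATADOG_AGENT_ADDRESS"
}
```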

Disable Tracing

To disable tracing for all services, add the --disable-tracing flag when deploying NGINX Service Mesh:

nginx-meshctl deploy ... --disable-tracing

Enable/Disable Tracing for Specific Services

To override the tracing setting for a specific service, add the config.nsm.nginx.com/tracing-enabled annotation, with a value of true or false, to the Pod template (PodSpec) of that service.
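One way to set the annotation on a running workload is kubectl patch. The sketch below assumes the workload is a Deployment; the Deployment name and namespace are placeholders, and set_tracing_annotation is a hypothetical helper name.

```shell
# Sketch: disable tracing for one workload by annotating its Pod template.
# Assumptions: the workload is a Deployment; names below are placeholders.
DEPLOYMENT="my-app"
NAMESPACE="example-namespace"
TRACING_ENABLED="false"  # annotation values are strings: "true" or "false"

# Merge patch that adds the annotation under spec.template.metadata.
PATCH=$(printf '{"spec":{"template":{"metadata":{"annotations":{"config.nsm.nginx.com/tracing-enabled":"%s"}}}}}' "$TRACING_ENABLED")

# set_tracing_annotation is a hypothetical helper name.
set_tracing_annotation() {
  kubectl patch deployment "$DEPLOYMENT" -n "$NAMESPACE" --type merge -p "$PATCH"
}
```

Changing the Pod template triggers a rollout, so the annotation lands on newly created Pods rather than on the ones already running.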

Set the Tracing Sample Rate

To set the tracing sample rate, add the --sample-rate flag to your deploy command. The value must be a float between 0 and 1 and is the probability that a span is sampled: a sample rate of 0.1 gives each span a 10% chance of being sampled. For example:

nginx-meshctl deploy ... --sample-rate 0.1
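If the sample rate comes from a script or CI variable, a quick range check before deploying can catch mistakes early. The sketch below treats both 0 and 1 as valid, which is an assumption about how the bounds are interpreted; valid_sample_rate is a hypothetical helper name.

```shell
# valid_sample_rate is a hypothetical helper: it succeeds when $1 is a
# plain decimal number in the range 0 to 1 (bounds assumed inclusive).
valid_sample_rate() {
  awk -v r="$1" 'BEGIN { exit !(r ~ /^[0-9]*\.?[0-9]+$/ && r >= 0 && r <= 1) }'
}

SAMPLE_RATE="0.1"
if valid_sample_rate "$SAMPLE_RATE"; then
  echo "sample rate $SAMPLE_RATE is in range"
else
  echo "sample rate $SAMPLE_RATE must be between 0 and 1" >&2
fi
```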

bookinfo Example

OpenTracing relies on the trace headers passed through each microservice in an application in order to build a full trace of a request. If you don’t configure your app to pass trace headers, you’ll get disjointed traces that are more difficult to understand.

The bookinfo example application supports trace headers. Take the steps below to visualize tracing:

  1. Follow the steps in the Deploy an Example App tutorial to deploy the bookinfo app.

  2. Reload the product page a few times. This triggers requests through all of the Services.

  3. Port-forward the Zipkin Service:

    kubectl -n nginx-mesh port-forward svc/zipkin 9411
    
  4. Visit http://localhost:9411 in a browser.

  5. You should see traces that traverse all of the services in the app.

Important:
If you are not using the default Zipkin server, you should adjust the port-forwarding (step 3 above) to the appropriate tracing backend and port.
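As an alternative to the browser check, the Zipkin v2 HTTP API can be queried from the command line. The sketch below assumes the default Zipkin server and that the port-forward from step 3 is still running; the helper names are hypothetical.

```shell
# Sketch: query the Zipkin v2 HTTP API through the port-forward from step 3.
# Assumption: the default Zipkin server is in use on localhost:9411.
ZIPKIN_URL="http://localhost:9411"

# List the service names Zipkin has received spans from.
list_traced_services() {
  curl -s "$ZIPKIN_URL/api/v2/services"
}

# Fetch up to $1 recent traces (default 5) as JSON.
recent_traces() {
  curl -s "$ZIPKIN_URL/api/v2/traces?limit=${1:-5}"
}
```

If list_traced_services returns an empty list, reload the product page a few more times and query again.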