Secure Mesh Traffic using mTLS

Overview

NGINX Service Mesh supports mTLS on GKE with Kubernetes version 1.13.7 or later.

All Kubernetes Resources that use the NGINX Service Mesh sidecar proxy inherit their mTLS settings from the global configuration. You can override the global setting for individual Resources if needed. Refer to Change the mTLS Setting for a Resource for instructions.

Usage

Enable mTLS

When deploying NGINX Service Mesh with mTLS enabled, you can opt to use permissive or strict mode. The default setting for mTLS is permissive.

Caution:
Using permissive mode is not recommended for production deployments.

To enable mTLS, specify the --mtls-mode flag with the desired setting when deploying NGINX Service Mesh. For example:

nginx-meshctl deploy ... --mtls-mode strict

Deploy Using an Upstream Root CA

By default, deployments with mTLS enabled use a self-signed certificate. This is acceptable for testing and evaluation, but production deployments should use a proper Public Key Infrastructure (PKI).

SPIRE uses a mechanism called “Upstream Authority” to interface with PKI systems. To use an upstream authority, you must provide the configuration and credentials that SPIRE needs to connect to the upstream and obtain certificates.

To use a PKI, first choose one of the upstream authorities that NGINX Service Mesh supports:

  • disk: Requires the certificate and private key to be on disk.

    • Template: disk.yaml

    • The minimal configuration to successfully deploy the mesh using the disk upstream authority looks like this:

      apiVersion: v1
      upstreamAuthority: disk
      config:
          cert_file_path: /path/to/rootCA.crt
          key_file_path: /path/to/rootCA.key
      
  • aws_pca: Uses AWS Certificate Manager Private Certificate Authority (ACM PCA) to manage certificates.

    • Template: aws_pca.yaml

    • Here is the minimal configuration to deploy the mesh using the aws_pca upstream authority:

      apiVersion: "v1"
      upstreamAuthority: "aws_pca"
      config:
          region: "us-west-2"
          certificate_authority_arn: "arn:aws:acm-pca::123456789012:certificate-authority/test"
          aws_access_key_id: "<ACCESS_KEY>"
          aws_secret_access_key: "<YOUR_SECRET_ACCESS_KEY>"
      

For a production deployment, you should provide the following:

  • rootCA.crt - A root CA certificate
  • rootCA.key - A root CA certificate key
  • intermediateCA.crt - An intermediate CA certificate (optional)
  • intermediateCA.key - An intermediate CA certificate key (optional)

Rather than signing with the root CA certificate and key directly, a production deployment should use an intermediate CA certificate and key. In this case, specify the root CA certificate using the appropriate bundle option for the upstream authority:

  • disk: bundle_file_path
  • aws_pca: supplemental_bundle_path

This keeps the root CA key secure because only the root CA certificate, not its key, is added to the chain. The upstream bundle may contain multiple intermediate certificates, all the way up to the root CA.

For example, a production deployment using the disk upstream authority will look something like this:

apiVersion: "v1"
upstreamAuthority: "disk"
config:
    cert_file_path: "/path/to/intermediateCA.crt"
    key_file_path: "/path/to/intermediateCA.key"
    bundle_file_path: "/path/to/rootCA.crt"

To deploy using one of these upstream authorities, you must specify the --mtls-upstream-ca-conf flag:

nginx-meshctl deploy ... --mtls-upstream-ca-conf /path/to/upstream_authority.yaml
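
Before deploying, it can help to confirm that the certificate and key referenced in the upstream authority configuration actually belong together. The following is a minimal sketch under stated assumptions, not a feature of nginx-meshctl: the helper names are hypothetical, the files are assumed to be PEM-encoded, and openssl is assumed to be installed locally.

```shell
#!/usr/bin/env bash
# Preflight sketch (hypothetical helpers, not part of nginx-meshctl).
# Assumes PEM-encoded files and a local openssl binary.
set -o pipefail  # make a failing openssl call fail the whole pipeline

# Print the SHA-256 digest of the public key embedded in a certificate.
cert_pubkey_digest() {
  openssl x509 -in "$1" -noout -pubkey \
    | openssl pkey -pubin -outform DER \
    | openssl dgst -sha256 -r | cut -d' ' -f1
}

# Print the SHA-256 digest of the public key derived from a private key.
key_pubkey_digest() {
  openssl pkey -in "$1" -pubout -outform DER \
    | openssl dgst -sha256 -r | cut -d' ' -f1
}

# Succeed only if the certificate and the private key share a public key.
cert_matches_key() {
  local cert_digest key_digest
  cert_digest="$(cert_pubkey_digest "$1")" || return 1
  key_digest="$(key_pubkey_digest "$2")" || return 1
  [ -n "$cert_digest" ] && [ "$cert_digest" = "$key_digest" ]
}

# Example (paths are placeholders):
#   cert_matches_key /path/to/intermediateCA.crt /path/to/intermediateCA.key \
#     && echo "certificate and key match"
```

Comparing public key digests works for any key type that openssl pkey can read, which is why the sketch avoids RSA-specific modulus checks.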

To find out more about how nginx-meshctl interprets the upstream authority configuration, refer to the Upstream CA Validation JSON schema.

Pathlen

X.509 CA certificates can carry a pathlen (path length) constraint that limits the number of intermediate certificates allowed between the current certificate and the final endpoint certificate, not including the endpoint certificate.

SPIRE creates a certificate for itself using the intermediate certificate passed in using the arguments defined above, so the pathlen must be either set to 1 or unset. For the root certificate, the pathlen must be at least 2, or unset.
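
To make these constraints concrete, here is one way a throwaway root and intermediate CA with explicit pathlen values could be generated for evaluation. This is a sketch only: the subject names, key size, lifetimes, and the ca.cnf file name are all illustrative assumptions, and a production PKI would normally be managed by dedicated tooling rather than ad hoc openssl commands.

```shell
#!/usr/bin/env bash
# Sketch: build a throwaway root and intermediate CA with explicit pathlen
# values, for evaluation only. Names, key sizes, and lifetimes are examples.
set -euo pipefail

workdir="$(mktemp -d)"
cd "$workdir"

# Extension sections: the root allows two CA levels below it, the
# intermediate allows one (the certificate SPIRE creates for itself).
cat > ca.cnf <<'EOF'
[req]
distinguished_name = dn
prompt = no

[dn]
CN = example-ca

[root_ext]
basicConstraints = critical, CA:TRUE, pathlen:2
keyUsage = critical, keyCertSign, cRLSign

[intermediate_ext]
basicConstraints = critical, CA:TRUE, pathlen:1
keyUsage = critical, keyCertSign, cRLSign
EOF

# Self-signed root CA.
openssl req -x509 -newkey rsa:2048 -nodes -config ca.cnf \
  -extensions root_ext -subj "/CN=example-root-ca" \
  -keyout rootCA.key -out rootCA.crt -days 365

# Intermediate CA key and signing request.
openssl req -new -newkey rsa:2048 -nodes -config ca.cnf \
  -subj "/CN=example-intermediate-ca" \
  -keyout intermediateCA.key -out intermediateCA.csr

# Sign the intermediate with the root, applying the pathlen:1 constraint.
openssl x509 -req -in intermediateCA.csr \
  -CA rootCA.crt -CAkey rootCA.key -CAcreateserial \
  -extfile ca.cnf -extensions intermediate_ext \
  -out intermediateCA.crt -days 365

# Confirm the chain and show the constraint on each certificate.
openssl verify -CAfile rootCA.crt intermediateCA.crt
for cert in rootCA.crt intermediateCA.crt; do
  openssl x509 -in "$cert" -noout -text | grep -A1 "Basic Constraints"
done
echo "Files written to $workdir"
```

The resulting intermediateCA.crt, intermediateCA.key, and rootCA.crt correspond to the cert_file_path, key_file_path, and bundle_file_path values in the disk example above.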

Change the mTLS Setting for a Resource

Important:
If the global mTLS mode is set to strict, the annotation value is ignored.

To override the global mTLS setting for a specific resource, add an annotation to the Resource definition. For example:

config.nsm.nginx.com/mtls-mode: "strict"
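
As an illustration of applying this annotation to a workload that is already running, the sketch below builds a patch body for kubectl patch. It rests on assumptions that are not stated above: that the annotation belongs on the workload's Pod template, that permissive is accepted as an annotation value alongside strict and off, and that the target is a Deployment. The helper name mtls_patch and the Deployment name my-app are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: build a patch that sets the mTLS annotation on a workload's Pod
# template. Placing the annotation on the Pod template is an assumption.

mtls_patch() {
  # Accept only the mode names used in this document.
  case "$1" in
    off|permissive|strict) ;;
    *) echo "unsupported mtls-mode: $1" >&2; return 1 ;;
  esac
  printf '{"spec":{"template":{"metadata":{"annotations":{"config.nsm.nginx.com/mtls-mode":"%s"}}}}}\n' "$1"
}

# Example (my-app is a placeholder Deployment name; not run here):
#   kubectl patch deployment my-app --patch "$(mtls_patch strict)"
```

Validating the mode before building the patch avoids applying a misspelled value that would otherwise be accepted by Kubernetes as an arbitrary annotation string.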

Disable mTLS

To disable mTLS globally, specify the --mtls-mode off flag when deploying NGINX Service Mesh. For example:

nginx-meshctl deploy ... --mtls-mode off

To disable mTLS for a specific Resource, add the following annotation to the Resource definition:

config.nsm.nginx.com/mtls-mode: "off"

Verify Deployment

When mTLS is enabled, NGINX Service Mesh deploys additional Pods in the nginx-mesh namespace for the SPIRE Server and Agent.

To verify the deployment, check that the SPIRE Pods are running. You should have a single Server Pod and one Agent Pod for each Kubernetes Node.

kubectl get pods -n nginx-mesh
NAME                READY   STATUS    RESTARTS   AGE
...
spire-agent-mb9jv   1/1     Running   0          24h
spire-server-0      2/2     Running   0          24h
...
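
The same check can be scripted. The following sketch counts Running Pods by name prefix from kubectl get pods output; the helper name count_running is hypothetical, and the comparison against the Node count assumes that an Agent is scheduled on every Node, which may not hold if some Nodes carry taints.

```shell
#!/usr/bin/env bash
# Sketch: count Running pods by name prefix from `kubectl get pods` output.
# The helper name is hypothetical.

# Reads pod rows on stdin; prints how many rows have a NAME starting with
# the given prefix and a STATUS of Running.
count_running() {
  awk -v prefix="$1" '
    index($1, prefix) == 1 && $3 == "Running" { n++ }
    END { print n + 0 }
  '
}

# Example against a live cluster (not run here):
#   agents="$(kubectl get pods -n nginx-mesh --no-headers | count_running spire-agent)"
#   nodes="$(kubectl get nodes --no-headers | awk 'END { print NR }')"
#   [ "$agents" -eq "$nodes" ] && echo "one Running SPIRE Agent per Node"
```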

Verify Encryption by Using an Example Service

We’ll use the Istio bookinfo example to test that traffic is, in fact, encrypted with mTLS enabled.

  1. First, deploy the bookinfo application:

    kubectl apply -f bookinfo.yaml
    
  2. To access bookinfo, set up port-forwarding:

    kubectl port-forward svc/productpage 9080
    
  3. Finally, navigate to http://localhost:9080 in a browser. The connection from your browser to the productpage Service is clear text, but all of the service-to-service calls inside the mesh are encrypted with mTLS.

Debug mTLS Issues

Not all mTLS misconfiguration errors can be caught when the configuration is loaded. For example, NGINX will not detect a certificate that expires during operation. NGINX responds with a 400 Bad Request error to requests that present an invalid certificate. Debugging information is written to the error log at the info level.

Refer to logging for information about changing the log level.