Technical Specifications

Cluster requirements and NGINX Service Mesh footprint

This document outlines the software versions NGINX Service Mesh uses by default and the resource overhead it incurs while running.

Software Versions

The following table lists the software versions NGINX Service Mesh uses by default.

| NGINX Service Mesh | NGINX Plus (sidecar) | SPIRE | NATS | Prometheus | NGINX OpenTracing | OpenTracing C++ | OpenTracing Zipkin C++ | OpenTracing Jaeger Client C++ | OpenTracing Datadog C++ Client | Grafana * | Jaeger (Default) * | Zipkin * |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| v1.2.1 | R24 P1 | 1.0.2 | nats:2.4.0-alpine3.14 | prom/prometheus:v2.20.1 | 0.10.0 | 1.5.1 | 0.5.2 | 0.4.2 | 1.2.0 | grafana/grafana:8.1.7 | jaegertracing/all-in-one:1.26.0 | openzipkin/zipkin:2.21 |
| v1.2.0 | R24 P1 | 1.0.2 | nats:2.4.0-alpine3.14 | prom/prometheus:v2.20.1 | 0.10.0 | 1.5.1 | 0.5.2 | 0.4.2 | 1.2.0 | grafana/grafana:8.1.3 | jaegertracing/all-in-one:1.26.0 | openzipkin/zipkin:2.21 |
| v1.1.0 | R24 P1 | 0.12.3 | nats:2.1.8-alpine3.11 | prom/prometheus:v2.20.1 | 0.10.0 | 1.5.1 | 0.5.2 | 0.4.2 | 1.2.0 | grafana/grafana:7.5.3 | jaegertracing/all-in-one:1.19.2 | openzipkin/zipkin:2.21 |
| v1.0.1 | R23 P1 | 0.12.1 | nats:2.1.8-alpine3.11 | prom/prometheus:v2.20.1 | 0.10.0 | 1.5.1 | 0.5.2 | 0.4.2 | 1.2.0 | grafana/grafana:7.5.3 | jaegertracing/all-in-one:1.19.2 | openzipkin/zipkin:2.21 |
| v1.0.0 | R23 | 0.12.1 | nats:2.1.8-alpine3.11 | prom/prometheus:v2.20.1 | 0.10.0 | 1.5.1 | 0.5.2 | 0.4.2 | 1.2.0 | grafana/grafana:7.5.3 | jaegertracing/all-in-one:1.19.2 | openzipkin/zipkin:2.21 |

* - Software not required by NGINX Service Mesh. See Monitoring and Tracing for details on disabling.
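
To confirm which of these component versions are actually running in your cluster, one option is to list the container images in the control plane namespace and compare them against the table above. The following is a minimal sketch, assuming the `kubernetes` Python client and a kubeconfig with access to the cluster; the `nginx-mesh` namespace name is an assumption and may differ in your deployment.

```python
# A minimal sketch, assuming the `kubernetes` Python client is installed and a
# kubeconfig with access to the cluster. "nginx-mesh" is assumed to be the
# control plane namespace; adjust it if your deployment uses a different one.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

for pod in core.list_namespaced_pod("nginx-mesh").items:
    images = sorted({c.image for c in pod.spec.containers})
    print(f"{pod.metadata.name}: {', '.join(images)}")
```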

Supported Kubernetes Versions

NGINX Service Mesh supports Kubernetes versions 1.16-1.21.
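
To check whether your cluster falls in this range, you can query the version reported by the API server. The sketch below is illustrative, assuming the `kubernetes` Python client and a working kubeconfig.

```python
# A minimal sketch, assuming the `kubernetes` Python client and a working kubeconfig.
from kubernetes import client, config

config.load_kube_config()
info = client.VersionApi().get_code()  # VersionInfo: major, minor, git_version

minor = int(info.minor.rstrip("+"))  # some providers report the minor version as e.g. "21+"
if info.major == "1" and 16 <= minor <= 21:
    print(f"Kubernetes {info.git_version} is within the supported 1.16-1.21 range.")
else:
    print(f"Kubernetes {info.git_version} is outside the supported 1.16-1.21 range.")
```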

Supported OpenShift Versions

NGINX Service Mesh supports OpenShift version 4.8.

Recommended Minimum Cluster Environments

A series of automated tests runs frequently to ensure mesh stability and reliability. For deployments of fewer than 100 Pods, the following minimum cluster environment is recommended:

| Environment | Machine Type | Number of Nodes |
|---|---|---|
| GKE | n2-standard-4 (4 vCPU, 16GB) | 3 |
| AKS | Standard_D4s_v3 (4 vCPU, 16GiB) | 3 |
| EKS | t3.xlarge (4 vCPU, 16GiB) | 3 |
| AWS | t3.xlarge (4 vCPU, 16GiB) | 1 Control, 3 Workers |

Overhead

The overhead of NGINX Service Mesh varies depending on the component in the mesh and the type of resources currently deployed. The control plane is responsible for holding the state of all managed resources, so it scales up linearly with the number of resources being handled, whether Pods, Services, TrafficSplits, or any other resource in NGINX Service Mesh. SPIRE in particular watches for new workloads, which correspond 1:1 with the Pods deployed; as a result, its usage grows as more Pods are added to the mesh.

The data plane sidecar must keep track of the other Services in the mesh as well as any traffic policies associated with it, so its resource load increases as a function of the number of Services and traffic policies in the mesh. To gauge this stress on the cluster, we run a nightly test that exercises the most critical components of the mesh. The details of that test are provided below so you can get an idea of the overhead each component is responsible for and size your own cluster accordingly.

Stress Test Overhead

Cluster Information:

  • Environment: GKE
  • Node Type: n2-standard-4 (4 vCPU, 16GB)
  • Number of nodes: 3
  • Kubernetes Version: 1.18.16

Metrics were gathered using the Kubernetes Metrics API. CPU is measured in cpu units, where one cpu is equivalent to 1 vCPU/core. For more information on the Metrics API and how the data is recorded, see the Metrics API documentation.
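
You can pull the same data ad hoc from a cluster that has metrics-server installed. The sketch below reads Pod metrics through the metrics.k8s.io API using the `kubernetes` Python client; the `nginx-mesh` namespace is an assumption.

```python
# A minimal sketch, assuming metrics-server is installed and the `kubernetes`
# Python client is available. The "nginx-mesh" namespace is an assumption.
from kubernetes import client, config

config.load_kube_config()
metrics = client.CustomObjectsApi().list_namespaced_custom_object(
    group="metrics.k8s.io", version="v1beta1", namespace="nginx-mesh", plural="pods"
)

for pod in metrics["items"]:
    for container in pod["containers"]:
        usage = container["usage"]  # CPU in cpu units (e.g. "3m"), memory in e.g. "Ki"
        print(pod["metadata"]["name"], container["name"], usage["cpu"], usage["memory"])
```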

CPU

| Num Services | Control Plane (without metrics and tracing) | Control Plane Total | Average Sidecar |
|---|---|---|---|
| 10 (20 Pods) | 0.075 vCPU | 0.095 vCPU | 0.033 vCPU |
| 50 (100 Pods) | 0.097 vCPU | 0.431 vCPU | 0.075 vCPU |
| 100 (200 Pods) | 0.148 vCPU | 0.233 vCPU | 0.050 vCPU |

Memory

| Num Services | Control Plane (without metrics and tracing) | Control Plane Total | Average Sidecar |
|---|---|---|---|
| 10 (20 Pods) | 168.766 MiB | 767.500 MiB | 33.380 MiB |
| 50 (100 Pods) | 215.289 MiB | 2347.258 MiB | 38.542 MiB |
| 100 (200 Pods) | 272.305 MiB | 4973.992 MiB | 52.946 MiB |
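
For rough capacity planning, the figures above can be combined into a back-of-the-envelope estimate of what the mesh itself adds to a cluster. The sketch below uses the 100-Service (200-Pod) row as illustrative inputs; actual usage depends on your workloads, traffic policies, and whether metrics and tracing are enabled.

```python
# A back-of-the-envelope sizing sketch using the 100-Service (200-Pod)
# stress-test figures above; illustrative only, not a guarantee.
CONTROL_PLANE_VCPU = 0.233    # control plane total, with metrics and tracing
CONTROL_PLANE_MIB = 4973.992
AVG_SIDECAR_VCPU = 0.050      # per-Pod sidecar averages at 200 Pods
AVG_SIDECAR_MIB = 52.946

def mesh_overhead(num_pods: int) -> tuple[float, float]:
    """Approximate cluster-wide (vCPU, MiB) added by the mesh."""
    cpu = CONTROL_PLANE_VCPU + num_pods * AVG_SIDECAR_VCPU
    mem = CONTROL_PLANE_MIB + num_pods * AVG_SIDECAR_MIB
    return cpu, mem

cpu, mem = mesh_overhead(150)
print(f"~{cpu:.2f} vCPU and ~{mem / 1024:.1f} GiB across the cluster")
```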

Disk Usage

SPIRE uses a persistent volume to make restarts more seamless. NGINX Service Mesh automatically allocates a 1 GB persistent volume in supported environments (see the Persistent Storage setup page for environment requirements). The table below shows the disk usage within that volume; usage scales with the number of Pods in the mesh.

| Num Pods | Disk Usage |
|---|---|
| 20 | 4.2 MB |
| 100 | 4.3 MB |
| 200 | 4.6 MB |
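
These figures stay far below the 1 GB volume. As an illustration only, a simple linear fit to the numbers above works out to a roughly 4 MB baseline plus a couple of kilobytes per additional Pod:

```python
# An illustrative linear fit to the disk-usage figures above; actual usage
# varies, but it remains well under the 1 GB persistent volume.
BASE_MB = 4.2                              # observed at 20 Pods
PER_POD_MB = (4.6 - 4.2) / (200 - 20)      # ~0.0022 MB per additional Pod

def spire_disk_mb(num_pods: int) -> float:
    return BASE_MB + max(num_pods - 20, 0) * PER_POD_MB

for pods in (100, 500, 1000):
    print(f"{pods} Pods: ~{spire_disk_mb(pods):.1f} MB")
```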

Ports

The following table lists the ports and IP addresses the NGINX Service Mesh sidecar binds.

| Port | IP Address | Protocol | Direction | Purpose |
|---|---|---|---|---|
| 8900 | 0.0.0.0 | All | Outgoing | Redirect to virtual server for traffic type ¹ |
| 8901 | 0.0.0.0 | All | Incoming | Redirect to virtual server for traffic type ¹ |
| 8902 | localhost | All | Outgoing | Redirection error |
| 8903 | localhost | All | Incoming | Redirection error |
| 8904 | localhost | TCP | Incoming | Main virtual server |
| 8905 | localhost | TCP | Incoming | TCP traffic denied by Access Control policies |
| 8906 | localhost | TCP | Outgoing | Main virtual server |
| 8907 | localhost | TCP | Incoming | Permissive virtual server ² |
| 8886 | 0.0.0.0 | HTTP | Control | NGINX Plus API |
| 8887 | 0.0.0.0 | HTTP | Control | Prometheus metrics |
| 8888 | localhost | HTTP | Incoming | Main virtual server |
| 8889 | localhost | HTTP | Outgoing | Main virtual server |
| 8890 | localhost | HTTP | Incoming | Permissive virtual server ² |
| 8891 | localhost | GRPC | Incoming | Main virtual server |
| 8892 | localhost | GRPC | Outgoing | Main virtual server |
| 8893 | localhost | GRPC | Incoming | Permissive virtual server ² |
| 8894 | localhost | HTTP | Outgoing | NGINX Ingress Controller egress traffic |
| 8895 | 0.0.0.0 | HTTP | Incoming | Redirect health probes ³ |
| 8896 | 0.0.0.0 | HTTP | Incoming | Redirect HTTPS health probes ³ |

Notes:

  1. All traffic is redirected to these two ports. From there the sidecar determines the traffic type and forwards the traffic to the Main virtual server for that traffic type.

  2. The Permissive virtual server is used when permissive mTLS is configured. It accepts non-mTLS traffic, for example from Pods that aren’t injected with a sidecar. See Secure Mesh Traffic using mTLS for more information on permissive mTLS.

  3. The Kubernetes readinessProbe and livenessProbe need dedicated ports as they’re not regular in-band mTLS traffic.
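
To verify from inside an injected Pod that the sidecar has bound its control ports, a plain TCP connect test is sufficient. The sketch below checks the NGINX Plus API and Prometheus metrics ports from the table above; it makes no assumptions about HTTP paths and is meant to be run inside a meshed Pod (for example via kubectl exec).

```python
# A minimal sketch: run inside a meshed Pod to confirm the sidecar is
# listening on the control ports listed in the table above.
import socket

def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for port, purpose in [(8886, "NGINX Plus API"), (8887, "Prometheus metrics")]:
    print(f"{port} ({purpose}): {'open' if is_listening(port) else 'closed'}")
```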