Scaling Guidance
Learn how to estimate the size needed for your NGINXaaS for Azure deployment and scale to match it.
NGINXaaS for Azure (NGINXaaS) supports specifying the deployment size at creation time, or updating it later. This allows you to adjust to application traffic requirements and control cost.
In this document you will learn:
- What an NGINX Capacity Unit (NCU) is
- How to estimate the amount of capacity to provision
- How to specify this capacity, including what restrictions apply
- How to monitor capacity usage and to iteratively refine the allocated capacity
NGINX Capacity Unit (NCU)
An NGINX Capacity Unit (NCU) quantifies the capacity of an NGINX instance based on the underlying compute resources. This abstraction allows you to specify the desired capacity in NCUs without having to consider the regional hardware differences.
An NGINX Capacity Unit consists of the following parameters:
- CPU: an NCU provides 20 Azure Compute Units (ACUs)
- Bandwidth: an NCU provides 60 Mbps of network throughput
- Concurrent connections: an NCU provides 400 concurrent connections
To calculate how many NCUs to provision, divide each requirement by the corresponding per-NCU capacity and take the maximum of the results.
Example 1: “I need 52 ACUs of compute, 2,000 concurrent connections, but only 4 Mbps of traffic.” You would need Max(52/20, 4/60, 2000/400)
= Max(2.6, 0.07, 5)
= at least 5 NCUs.
Example 2: “I don’t know any of these yet!” Start with the minimum and adjust capacity with the iterative approach described below.
In addition to the maximum capacity needed, we recommend adding a 10% to 20% buffer of additional NCUs to account for unexpected increases in traffic. Monitor the NCUs Consumed metric over time to determine your peak usage levels and adjust your requested capacity accordingly.
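The sizing rule above can be sketched in a few lines of Python. The per-NCU figures are taken from the NCU parameter list; the buffer factor and the function name are illustrative assumptions you should tune:

```python
import math

# Per-NCU capacities, as defined in the NCU parameter list above.
ACU_PER_NCU = 20           # Azure Compute Units
MBPS_PER_NCU = 60          # network throughput in Mbps
CONNECTIONS_PER_NCU = 400  # concurrent connections

def estimate_ncus(acus: float, mbps: float, connections: float,
                  buffer: float = 0.15) -> int:
    """Return the NCUs to provision: the maximum of the three ratios,
    plus a safety buffer (10% to 20% recommended), rounded up."""
    base = max(acus / ACU_PER_NCU,
               mbps / MBPS_PER_NCU,
               connections / CONNECTIONS_PER_NCU)
    return math.ceil(base * (1 + buffer))

# Example 1 from above: 52 ACUs, 4 Mbps, 2,000 concurrent connections.
print(estimate_ncus(52, 4, 2000, buffer=0))     # -> 5
print(estimate_ncus(52, 4, 2000, buffer=0.15))  # -> 6 (with 15% buffer)
```

With the recommended buffer applied, the same workload rounds up from 5 to 6 NCUs, which is the headroom the paragraph above suggests carrying.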
Adjusting Capacity
An NGINXaaS deployment can be scaled out to increase the capacity (increasing the cost) or scaled in to decrease the capacity (reducing the cost). Capacity is measured in NCUs.
To update the capacity of your deployment:
- Select NGINXaaS scaling in the left menu.
- Set the desired number of NCUs.
- Click Submit to update your deployment.
Capacity Restrictions
The following table outlines constraints on the specified capacity based on the chosen Marketplace plan, including the minimum capacity required for a deployment to be highly available, the maximum capacity, and what value the capacity must be a multiple of. By default, an NGINXaaS for Azure deployment will be created with the corresponding minimum capacity.
| Marketplace Plan | Minimum Capacity (NCUs) | Maximum Capacity (NCUs) | Multiple of |
|---|---|---|---|
| Standard | 10 | 500 | 10 |
Note:
If you need a higher maximum capacity, please open a request and specify the Resource ID of your NGINXaaS deployment, the region, and the desired maximum capacity you wish to scale to.
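A requested capacity can be sanity-checked against these constraints before submitting it. The numbers below come from the Standard plan row in the table above; the helper name is illustrative:

```python
# Standard plan constraints, from the capacity restrictions table above.
MIN_NCUS, MAX_NCUS, MULTIPLE = 10, 500, 10

def validate_capacity(ncus: int) -> None:
    """Raise ValueError if the requested capacity violates plan limits."""
    if ncus < MIN_NCUS:
        raise ValueError(f"capacity must be at least {MIN_NCUS} NCUs")
    if ncus > MAX_NCUS:
        raise ValueError(f"capacity must be at most {MAX_NCUS} NCUs")
    if ncus % MULTIPLE != 0:
        raise ValueError(f"capacity must be a multiple of {MULTIPLE}")

validate_capacity(60)    # OK
# validate_capacity(55)  # would raise: capacity must be a multiple of 10
```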
Connection Processing Methods Restrictions
- NGINXaaS only supports the `epoll` connection processing method when using the `use` directive, as NGINXaaS is based on Linux.
Metrics
NGINXaaS provides metrics for visibility into the current and historical capacity values. These metrics, in the NGINXaaS Statistics namespace, include:
- NCUs Requested: `ncu.requested` – how many NCUs have been requested via the API. This is the goal state of the system at that point in time.
- NCUs Provisioned: `ncu.provisioned` – how many NCUs have been successfully provisioned by the service.
  - This is the basis for billing.
  - This may differ from `ncu.requested` temporarily during scale-out/scale-in events or during automatic remediation for a hardware failure.
- NCUs Consumed: `ncu.consumed` – how many NCUs the current workload is using.
  - If this is much lower than the requested capacity, consider scaling in to reduce costs. If this is close to, or above, the requested capacity, consider scaling out; otherwise requests may fail or take longer than expected.
  - This value may burst higher than `ncu.requested` due to variation in provisioned hardware. You will still only be billed for the minimum of `ncu.requested` and `ncu.provisioned`.
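The scale-in/scale-out guidance for `ncu.consumed` can be expressed as a simple decision rule. The utilization thresholds here (scale in below 60%, scale out above 90%) are illustrative assumptions, not service-defined values:

```python
def scaling_advice(consumed: float, requested: float,
                   low: float = 0.6, high: float = 0.9) -> str:
    """Compare consumed NCUs against requested capacity and suggest an action.
    The low/high utilization thresholds are assumptions to tune."""
    utilization = consumed / requested
    if utilization < low:
        return "scale in"   # paying for capacity the workload doesn't use
    if utilization > high:
        return "scale out"  # near or above capacity; requests may suffer
    return "hold"

print(scaling_advice(18, 60))  # -> scale in  (30% utilization)
print(scaling_advice(41, 50))  # -> hold      (82% utilization)
```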
See the Metrics Catalog for a reference of all metrics.
Note:
These metrics aren’t visible unless enabled; see how to Enable Monitoring for details.
Iterative approach
1. Make an estimate by either:
   - using the Usage and Cost Estimator
   - comparing to a reference workload
2. Observe the `ncu.consumed` metric of your workload in Azure Monitor.
3. Decide what headroom factor you wish to have.
4. Multiply the headroom factor by the consumed NCUs to get the target NCUs.
5. Adjust capacity to the target NCUs.
6. Repeat from step 2 – it is always good to check back after making a change.
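Steps 3 to 5 amount to a single calculation: multiply the consumed NCUs by the headroom factor, then round up to the plan's capacity multiple. A minimal sketch, assuming the Standard plan's multiple of 10:

```python
import math

def target_capacity(consumed: float, headroom: float,
                    multiple: int = 10) -> int:
    """Target NCUs: consumed * headroom, rounded up to the plan multiple."""
    return math.ceil(consumed * headroom / multiple) * multiple

# The worked example below: 18 NCUs consumed, 3x expected midday traffic,
# 18 * 3 = 54, rounded up to the next multiple of 10.
print(target_capacity(18, 3))  # -> 60
```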
Example:
- I am really unsure what size I need, so I just specified the default capacity, 20 NCUs.
- I observe that my `ncu.consumed` is currently at 18 NCUs.
- This is early morning traffic. I think midday traffic could be 3x what it is now. 18 * 3 = 54 is my target capacity.
- I can see that I need to scale by multiples of 10, so I’m going to scale out to 60 NCUs.
- At midday I can see that I overestimated the traffic I would be getting, though it was still a busy day. We peaked at 41 NCUs, so let me scale in to 50 NCUs to reduce my cost.
Reference workloads
These reference workloads were all measured with a simple NGINX configuration proxying requests to an upstream. Keepalive between NGINX and the upstream is enabled, and minimal request matching or manipulation is done.
| TLS? | Conn/s | Req/s | Response Size | Throughput | NCU |
|---|---|---|---|---|---|
| no | 12830 | 13430 | 0 KB | 23 Mbps | 18.8 |
| no | 12080 | 13046 | 1 KB | 125 Mbps | 19 |
| no | 12215 | 12215 | 10 KB | 953 Mbps | 21 |
| no | 1960 | 1690 | 100 KB | 1295 Mbps | 23.6 |