Rate Limit Policy API
Learn how to use the RateLimitPolicy API.
The RateLimitPolicy API allows Cluster Operators and Application Developers to configure NGINX Gateway Fabric to set rate limits on its provisioned NGINX instances.
The settings in RateLimitPolicy correspond to the following NGINX directives:

- limit_req_zone
- limit_req
- limit_req_status
- limit_req_log_level
- limit_req_dry_run
RateLimitPolicy is an Inherited PolicyAttachment that can be applied to a Gateway, HTTPRoute, or GRPCRoute in the same namespace as the RateLimitPolicy.
When applied to a Gateway, the settings specified in the RateLimitPolicy affect all HTTPRoutes and GRPCRoutes attached to the Gateway. This allows Cluster Operators to set defaults for all applications using the Gateway. The NGINX directives will be set at the http context.
When applied to an HTTPRoute or GRPCRoute, the settings in the RateLimitPolicy affect only the route they are applied to. This allows Application Developers to set values for their applications based on their application’s behavior or requirements. The limit_req_zone NGINX directive will be set at the http context, while the other NGINX directives will be set at the location context.
Rate limit directives applied to an HTTPRoute or GRPCRoute are applied alongside the rate limit directives applied to a Gateway. Requests to an upstream pass through all limit_req directives at the applicable http and location contexts; if any directive's configured rate is exceeded, the request is delayed or rejected according to that directive's burst settings. As a result, there is no way for a RateLimitPolicy set on an HTTPRoute or GRPCRoute to "override" settings set by a RateLimitPolicy attached to a Gateway.
This guide will show you how to use the RateLimitPolicy API to configure rate limiting for your applications.
For all the possible configuration options for RateLimitPolicy, see the API reference.
- Install NGINX Gateway Fabric.
Create a few example applications:
kubectl apply -f https://raw.githubusercontent.com/nginx/nginx-gateway-fabric/v2.4.0/examples/rate-limit-policy/app.yaml

The example coffee, tea, and grpc-backend applications will be used to show various rate limiting configurations.
Create a Gateway:
kubectl apply -f https://raw.githubusercontent.com/nginx/nginx-gateway-fabric/v2.4.0/examples/rate-limit-policy/gateway.yaml

Create routes for the applications:
kubectl apply -f https://raw.githubusercontent.com/nginx/nginx-gateway-fabric/v2.4.0/examples/rate-limit-policy/routes.yaml

This will create HTTPRoutes for the coffee and tea applications, and a GRPCRoute for the grpc-backend.
After creating the Gateway resource, NGINX Gateway Fabric will provision an NGINX Pod and Service fronting it to route traffic. Verify the gateway is created:
kubectl describe gateways.gateway.networking.k8s.io gateway

Verify the status is Accepted:
Status:
  Addresses:
    Type:   IPAddress
    Value:  10.96.94.152
  Conditions:
    Last Transition Time:  2026-01-15T22:15:47Z
    Message:               The Gateway is accepted
    Observed Generation:   1
    Reason:                Accepted
    Status:                True
    Type:                  Accepted
    Last Transition Time:  2026-01-15T22:15:47Z
    Message:               The Gateway is programmed
    Observed Generation:   1
    Reason:                Programmed
    Status:                True
    Type:                  Programmed

Save the public IP address and port(s) of the Gateway into shell variables:
GW_IP=XXX.YYY.ZZZ.III
GW_PORT=<port number>

In a production environment, you should have a DNS record for the external IP address that is exposed, and it should resolve to the hostname that the Gateway forwards traffic for.
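If you are unsure of the values, you can read them from the Gateway's provisioned Service. A minimal sketch, assuming the Service is named gateway-nginx, lives in the default namespace, and is of type LoadBalancer (adjust the name and namespace to match your deployment):

# Hypothetical Service name; NGF names the Service after the Gateway.
GW_IP=$(kubectl get service gateway-nginx -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
GW_PORT=$(kubectl get service gateway-nginx -o jsonpath='{.spec.ports[0].port}')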
Test the configuration:
You can send traffic to the coffee and tea applications using the external IP address and port for the NGINX Service.
Send a request to coffee:
curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/coffee

This request should receive a response from the coffee Pod:
Server address: 10.244.0.22:8080
Server name: coffee-654ddf664b-6mwtb

Send a request to tea:
curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/tea

This request should receive a response from the tea Pod:
Server address: 10.244.0.23:8080
Server name: tea-75bc9f4b6d-g9t84

Before we enable rate limiting, try sending multiple requests to coffee:
for i in `seq 1 10`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/coffee; done

You should see all successful responses in quick succession, as we have not configured any rate limiting rules yet.
To set rate limit settings for the coffee HTTPRoute created during setup, add the following RateLimitPolicy:
kubectl apply -f - <<EOF
apiVersion: gateway.nginx.org/v1alpha1
kind: RateLimitPolicy
metadata:
  name: route-rate-limit
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: coffee
  rateLimit:
    local:
      rules:
      - zoneSize: 10m
        key: "\$binary_remote_addr"
        rate: 1r/s
        burst: 3
EOF

This RateLimitPolicy targets the coffee HTTPRoute we created in the setup by specifying it in the targetRefs field. It configures the following rate limit settings:
- zoneSize: "10m": Sets the size of the shared memory zone that keeps state for the keys.
- key: "$binary_remote_addr": Sets the key the rate limit is applied to, in this case the client IP address. Requests from different client IP addresses are rate limited separately.
- rate: "1r/s": Sets the rate of requests permitted per key to one request per second.
- burst: "3": Sets the maximum burst size of requests to 3. Excessive requests are delayed until their number exceeds the maximum burst size, in which case the request is terminated with an error.
Verify that the RateLimitPolicy is Accepted:
kubectl describe ratelimitpolicies.gateway.nginx.org route-rate-limit

You should see the following status:
Status:
  Ancestors:
    Ancestor Ref:
      Group:      gateway.networking.k8s.io
      Kind:       HTTPRoute
      Name:       coffee
      Namespace:  default
    Conditions:
      Last Transition Time:  2026-01-15T22:17:46Z
      Message:               The Policy is accepted
      Observed Generation:   1
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
    Controller Name:         gateway.nginx.org/nginx-gateway-controller
Events: <none>

You can also verify that the policy was applied to the HTTPRoute by checking the HTTPRoute’s status:
kubectl describe httproute coffee

Look for the RateLimitPolicyAffected condition in the HTTPRoute status:
Status:
  Conditions:
    Last Transition Time:  2026-01-15T22:17:46Z
    Message:               The RateLimitPolicy is applied to the resource
    Observed Generation:   1
    Reason:                PolicyAffected
    Status:                True
    Type:                  RateLimitPolicyAffected

This condition indicates that a RateLimitPolicy has been successfully applied to the HTTPRoute.
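To check just this condition without scanning the full describe output, you can query the HTTPRoute status with a JSONPath expression. A sketch, assuming the condition is reported on the first entry in status.parents:

# Prints "True" if a RateLimitPolicy affects the route.
kubectl get httproute coffee -o jsonpath='{.status.parents[0].conditions[?(@.type=="RateLimitPolicyAffected")].status}'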
Test the configuration by sending a request to the coffee application:
curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/coffee

This request should receive a response from the coffee Pod:
Server address: 10.244.0.22:8080
Server name: coffee-654ddf664b-6mwtb

When processing a single request, the rate limiting configuration has no noticeable effect. Try to exceed the set rate limit with a script that sends multiple requests:
for i in `seq 1 10`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/coffee; done

You should see all successful responses from the coffee Pod, but they should be spaced roughly one second apart, as expected from the rate limiting configuration.
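To observe the delay directly, you can have curl print only the status code and the total time each request took; a small sketch using standard curl options:

# Each line shows the HTTP status and elapsed time; once the burst is
# consumed, times should approach one second.
for i in `seq 1 10`; do curl -s -o /dev/null -w "%{http_code} %{time_total}s\n" --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/coffee; done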
Using the same script on the tea application, we can see that no rate limit settings apply to it:
for i in `seq 1 10`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/tea; done

RateLimitPolicy can also target GRPCRoutes. To do so, re-use the policy created for the coffee HTTPRoute and add an additional targetRef:
kubectl apply -f - <<EOF
apiVersion: gateway.nginx.org/v1alpha1
kind: RateLimitPolicy
metadata:
  name: route-rate-limit
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: coffee
  - group: gateway.networking.k8s.io
    kind: GRPCRoute
    name: grpc-route
  rateLimit:
    local:
      rules:
      - zoneSize: 10m
        key: "\$binary_remote_addr"
        rate: 1r/s
        burst: 3
EOF

This will let the RateLimitPolicy affect both the coffee HTTPRoute and the grpc-route GRPCRoute.
RateLimitPolicy does not allow mixing Gateway kind with HTTPRoute or GRPCRoute kinds in targetRefs.
Verify that the RateLimitPolicy is Accepted:
kubectl describe ratelimitpolicies.gateway.nginx.org route-rate-limit

You should see the following status:
Status:
  Ancestors:
    Ancestor Ref:
      Group:      gateway.networking.k8s.io
      Kind:       HTTPRoute
      Name:       coffee
      Namespace:  default
    Conditions:
      Last Transition Time:  2026-01-15T22:19:35Z
      Message:               The Policy is accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
    Controller Name:         gateway.nginx.org/nginx-gateway-controller
    Ancestor Ref:
      Group:      gateway.networking.k8s.io
      Kind:       GRPCRoute
      Name:       grpc-route
      Namespace:  default
    Conditions:
      Last Transition Time:  2026-01-15T22:19:35Z
      Message:               The Policy is accepted
      Observed Generation:   2
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
    Controller Name:         gateway.nginx.org/nginx-gateway-controller
Events: <none>

You can also verify that the policy was applied to the GRPCRoute by checking the GRPCRoute’s status:
kubectl describe grpcroute grpc-route

Look for the RateLimitPolicyAffected condition in the GRPCRoute status:
Status:
  Conditions:
    Last Transition Time:  2026-01-15T22:19:35Z
    Message:               The RateLimitPolicy is applied to the resource
    Observed Generation:   1
    Reason:                PolicyAffected
    Status:                True
    Type:                  RateLimitPolicyAffected

This condition indicates that a RateLimitPolicy has been successfully applied to the GRPCRoute.
To access the application and verify that the RateLimitPolicy has been applied to the GRPCRoute, we will use grpcurl. Since we are running things locally, we first need to create a separate .proto source file that describes the service.
Create a new file named grpc.proto with these contents:
syntax = "proto3";

option go_package = "google.golang.org/grpc/examples/helloworld/helloworld";
option java_multiple_files = true;
option java_package = "io.grpc.examples.helloworld";
option java_outer_classname = "HelloWorldProto";

package helloworld;

// The greeting service definition.
service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

// The request message containing the user's name.
message HelloRequest {
  string name = 1;
}

// The response message containing the greetings
message HelloReply {
  string message = 1;
}

In the same directory, test our gRPC application through grpcurl:
grpcurl -plaintext -proto grpc.proto -authority grpc.example.com -d '{"name": "exact"}' ${GW_IP}:${GW_PORT} helloworld.Greeter/SayHello

The request should receive a response from the gRPC backend:
{
  "message": "Hello exact"
}

When processing a single request, the rate limiting configuration has no noticeable effect. Try to exceed the set rate limit with a script that sends multiple requests:
for i in `seq 1 10`; do grpcurl -plaintext -proto grpc.proto -authority grpc.example.com -d '{"name": "exact"}' ${GW_IP}:${GW_PORT} helloworld.Greeter/SayHello; done

You should see all successful responses from the grpc-backend, but they should be spaced roughly one second apart, as expected from the rate limiting configuration.
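To see the spacing without reading through the full responses, you can print a timestamp before each call; a sketch using the same grpcurl invocation:

# Timestamps should be roughly one second apart once the burst is consumed.
for i in `seq 1 5`; do date +%T; grpcurl -plaintext -proto grpc.proto -authority grpc.example.com -d '{"name": "exact"}' ${GW_IP}:${GW_PORT} helloworld.Greeter/SayHello > /dev/null; done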
Before setting rate limiting on the Gateway, verify that the tea application isn’t affected by any rate limit configuration.
for i in `seq 1 10`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/tea; done

You should see that these requests have successful responses with no delays.
To set rate limit settings for the Gateway we created in the setup, add the following RateLimitPolicy:
kubectl apply -f - <<EOF
apiVersion: gateway.nginx.org/v1alpha1
kind: RateLimitPolicy
metadata:
  name: gateway-rate-limit
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway
  rateLimit:
    local:
      rules:
      - zoneSize: 10m
        key: "\$binary_remote_addr"
        rate: 10r/s
EOF

This RateLimitPolicy targets the Gateway we created in the setup by specifying it in the targetRefs field. Additionally, it omits the burst field, meaning any excessive requests will cause an error code to be returned.
Since this policy is applied to the Gateway, it will affect all HTTPRoutes and GRPCRoutes attached to the Gateway.
Verify that the RateLimitPolicy is Accepted:
kubectl describe ratelimitpolicies.gateway.nginx.org gateway-rate-limit

You should see the following status:
Status:
  Ancestors:
    Ancestor Ref:
      Group:      gateway.networking.k8s.io
      Kind:       Gateway
      Name:       gateway
      Namespace:  default
    Conditions:
      Last Transition Time:  2026-01-15T22:23:38Z
      Message:               The Policy is accepted
      Observed Generation:   1
      Reason:                Accepted
      Status:                True
      Type:                  Accepted
    Controller Name:         gateway.nginx.org/nginx-gateway-controller
Events: <none>

You can also verify that the policy was applied to the Gateway by checking the Gateway’s status:
kubectl describe gateway gateway

Look for the RateLimitPolicyAffected condition in the Gateway status:
Status:
  Conditions:
    Last Transition Time:  2026-01-15T22:23:38Z
    Message:               The RateLimitPolicy is applied to the resource
    Observed Generation:   1
    Reason:                PolicyAffected
    Status:                True
    Type:                  RateLimitPolicyAffected

This condition indicates that a RateLimitPolicy has been successfully applied to the Gateway.
Verify that the tea application is affected by the rate limit configuration:
for i in `seq 1 10`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/tea; done

Even though there is no RateLimitPolicy specifically targeting the tea HTTPRoute, the HTTPRoute is attached to the Gateway, so it is affected by the policy. There should be a mix of successful responses and 503 error codes (the default error code). This is expected: because this RateLimitPolicy omits the burst field, excessive requests receive an error code as the response instead of being delayed. For more information on the underlying NGINX configuration of the limit_req_module, see the official NGINX documentation.
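To summarize how many requests succeed versus get rejected, you can count the returned status codes; a sketch using standard shell tools:

# Tally of HTTP status codes across ten rapid requests, e.g. a mix of 200s and 503s.
for i in `seq 1 10`; do curl -s -o /dev/null -w "%{http_code}\n" --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/tea; done | sort | uniq -c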
The RateLimitPolicy attached to the Gateway affects all Routes, meaning the coffee HTTPRoute and grpc-route GRPCRoute are also affected by this policy. In NGINX, a request passes through all limit_req directives that apply to it; if any of their rates is exceeded, the request is delayed or rejected according to that limit_req's rules. In this case, since the RateLimitPolicy attached to the Routes has the same key and a lower rate of 1r/s, requests sent to coffee and the grpc-backend below the 10r/s rate will still be delayed by the 1r/s rate.
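Conceptually, the rendered NGINX configuration for the coffee location looks something like the following. This is an illustrative sketch only, not the literal configuration NGINX Gateway Fabric generates; the zone names follow the pattern visible in the NGINX logs below:

# Sketch of the stacked rate limits; not the exact rendered config.
http {
    # Zones from both policies live at the http context.
    limit_req_zone $binary_remote_addr zone=default_rl_gateway-rate-limit_rule0:10m rate=10r/s;
    limit_req_zone $binary_remote_addr zone=default_rl_route-rate-limit_rule0:10m rate=1r/s;

    server {
        location /coffee {
            # A request is checked against every applicable limit_req,
            # so the stricter 1r/s limit effectively governs this route.
            limit_req zone=default_rl_gateway-rate-limit_rule0;
            limit_req zone=default_rl_route-rate-limit_rule0 burst=3;
        }
    }
}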
To test this, send five requests to the coffee application:
for i in `seq 1 5`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/coffee; done

The requests will each be delayed at a rate of one request per second.
NGINX logs a message for each request it rate limits.
Send a few requests to both tea and coffee applications:
for i in `seq 1 5`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/coffee; done
for i in `seq 1 5`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/tea; done

View the NGINX logs:
kubectl logs <gateway-nginx-pod-name>

127.0.0.1 - - [15/Jan/2026:20:44:57 +0000] "GET /coffee HTTP/1.1" 200 161 "-" "curl/8.7.1"
2026/01/15 20:44:57 [warn] 5788#5788: *2927 delaying request, excess: 0.943, by zone "default_rl_route-rate-limit_rule0", client: 127.0.0.1, server: cafe.example.com, request: "GET /coffee HTTP/1.1", host: "cafe.example.com:8080"
127.0.0.1 - - [15/Jan/2026:20:44:58 +0000] "GET /coffee HTTP/1.1" 200 161 "-" "curl/8.7.1"
2026/01/15 20:44:58 [info] 5788#5788: *2925 client 127.0.0.1 closed keepalive connection
2026/01/15 20:44:58 [warn] 5788#5788: *2930 delaying request, excess: 0.885, by zone "default_rl_route-rate-limit_rule0", client: 127.0.0.1, server: cafe.example.com, request: "GET /coffee HTTP/1.1", host: "cafe.example.com:8080"
127.0.0.1 - - [15/Jan/2026:20:44:59 +0000] "GET /coffee HTTP/1.1" 200 161 "-" "curl/8.7.1"
...
127.0.0.1 - - [15/Jan/2026:20:45:05 +0000] "GET /tea HTTP/1.1" 200 155 "-" "curl/8.7.1"
127.0.0.1 - - [15/Jan/2026:20:45:05 +0000] "GET /tea HTTP/1.1" 503 190 "-" "curl/8.7.1"
2026/01/15 20:45:05 [error] 5788#5788: *2938 limiting requests, excess: 0.500 by zone "default_rl_gateway-rate-limit_rule0", client: 127.0.0.1, server: cafe.example.com, request: "GET /tea HTTP/1.1", host: "cafe.example.com:8080"
127.0.0.1 - - [15/Jan/2026:20:45:05 +0000] "GET /tea HTTP/1.1" 200 155 "-" "curl/8.7.1"
127.0.0.1 - - [15/Jan/2026:20:45:05 +0000] "GET /tea HTTP/1.1" 503 190 "-" "curl/8.7.1"
2026/01/15 20:45:05 [error] 5788#5788: *2941 limiting requests, excess: 0.610 by zone "default_rl_gateway-rate-limit_rule0", client: 127.0.0.1, server: cafe.example.com, request: "GET /tea HTTP/1.1", host: "cafe.example.com:8080"
...

You should see NGINX log messages at the default error log level showing NGINX limiting requests. Since the coffee application has burst=3, the requests are delayed and all return a 200 status code, while the tea application has no burst configured, so all excessive requests receive a 503.
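To filter the logs down to just the rate limiting messages, you can pipe them through grep:

kubectl logs <gateway-nginx-pod-name> | grep -E "limiting requests|delaying request"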
Both the error log level and status code are adjustable in the RateLimitPolicy:
kubectl apply -f - <<EOF
apiVersion: gateway.nginx.org/v1alpha1
kind: RateLimitPolicy
metadata:
  name: gateway-rate-limit
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway
  rateLimit:
    local:
      rules:
      - zoneSize: 10m
        key: "\$binary_remote_addr"
        rate: 10r/s
    logLevel: "warn"
    rejectCode: 429
EOF

This will change the logLevel to warn and the rejectCode to 429.
Send some requests to the tea application with the new logLevel and rejectCode:
for i in `seq 1 5`; do curl --resolve cafe.example.com:$GW_PORT:$GW_IP http://cafe.example.com:$GW_PORT/tea; done

Viewing the NGINX logs, we can see the changes:
kubectl logs <gateway-nginx-pod-name>

127.0.0.1 - - [21/Jan/2026:18:55:39 +0000] "GET /tea HTTP/1.1" 200 155 "-" "curl/8.7.1"
127.0.0.1 - - [21/Jan/2026:18:55:39 +0000] "GET /tea HTTP/1.1" 429 162 "-" "curl/8.7.1"
2026/01/21 18:55:39 [warn] 1290#1290: *595 limiting requests, excess: 0.580 by zone "default_rl_gateway-rate-limit_rule0", client: 127.0.0.1, server: cafe.example.com, request: "GET /tea HTTP/1.1", host: "cafe.example.com:8080"
127.0.0.1 - - [21/Jan/2026:18:55:39 +0000] "GET /tea HTTP/1.1" 200 155 "-" "curl/8.7.1"
127.0.0.1 - - [21/Jan/2026:18:55:39 +0000] "GET /tea HTTP/1.1" 429 162 "-" "curl/8.7.1"
2026/01/21 18:55:39 [warn] 1292#1292: *598 limiting requests, excess: 0.470 by zone "default_rl_gateway-rate-limit_rule0", client: 127.0.0.1, server: cafe.example.com, request: "GET /tea HTTP/1.1", host: "cafe.example.com:8080"
2026/01/21 18:55:39 [warn] 1293#1293: *599 limiting requests, excess: 0.020 by zone "default_rl_gateway-rate-limit_rule0", client: 127.0.0.1, server: cafe.example.com, request: "GET /tea HTTP/1.1", host: "cafe.example.com:8080"
127.0.0.1 - - [21/Jan/2026:18:55:39 +0000] "GET /tea HTTP/1.1" 429 162 "-" "curl/8.7.1"

Since Routes are affected by all RateLimitPolicies on a Gateway, there is no way for a RateLimitPolicy attached to a Route to overwrite or negate a rule set by one on a Gateway.
When multiple RateLimitPolicies select the same targetRef and specify any of dryRun, logLevel, or rejectCode, only one policy is applied. The controller selects the policy with the highest priority: the oldest by creation time, with ties broken by alphabetical order of the policy names. Rejected policies have their Accepted condition set to False with the reason Conflicted.
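A quick way to spot a rejected policy is to print each policy's Accepted reason. A sketch, assuming the condition is reported on the first entry in status.ancestors:

# A conflicted policy shows the reason "Conflicted" instead of "Accepted".
kubectl get ratelimitpolicies.gateway.nginx.org -o jsonpath='{range .items[*]}{.metadata.name}{": "}{.status.ancestors[0].conditions[?(@.type=="Accepted")].reason}{"\n"}{end}'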
The limit_req_dry_run NGINX directive can be enabled by setting spec.rateLimit.dryRun to true. In this mode, the rate limit is not enforced, but the number of excessive requests is still accounted for as usual in the shared memory zone.
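For example, to account for excessive requests without enforcing the limit, the Gateway policy could enable dry run mode. A sketch based on the spec.rateLimit.dryRun field path described above:

kubectl apply -f - <<EOF
apiVersion: gateway.nginx.org/v1alpha1
kind: RateLimitPolicy
metadata:
  name: gateway-rate-limit
spec:
  targetRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway
  rateLimit:
    dryRun: true
    local:
      rules:
      - zoneSize: 10m
        key: "\$binary_remote_addr"
        rate: 10r/s
EOF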
NGINX has a measure of protection against modifying the key of an existing limit_req_zone. If the key is changed, the following error message appears in the NGINX logs and the worker processes are not reloaded, meaning the new configuration does not take effect:
2026/01/21 23:45:45 [emerg] 32645#32645: limit_req (a zone name) uses the (new key name) key while previously it used the (old key name) key

This occurs when the key of an existing rule in a RateLimitPolicy is modified after the policy is created. It does not happen upon first creation of the RateLimitPolicy, or when adding new rules, only when modifying an existing rule.
A workaround is to either delete and re-deploy the RateLimitPolicy, or to modify the zoneSize field alongside the key. Modifying the zoneSize results in the creation of a new zone, allowing the NGINX worker processes to reload.
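For example, to change the key of the route policy's rule, you could delete and re-apply it; a sketch, where rate-limit-policy.yaml is a hypothetical file containing the updated policy:

kubectl delete ratelimitpolicies.gateway.nginx.org route-rate-limit
kubectl apply -f rate-limit-policy.yaml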
You can verify the worker processes have been reloaded by checking for these NGINX logs:
...
2026/01/22 19:17:24 [notice] 12#12: start worker process 85
2026/01/22 19:17:24 [notice] 12#12: start worker process 86
2026/01/22 19:17:24 [notice] 12#12: start worker process 87
2026/01/22 19:17:24 [notice] 48#48: gracefully shutting down
2026/01/22 19:17:24 [notice] 49#49: gracefully shutting down
...
time=2026-01-22T19:17:27.978Z level=INFO msg="NGINX workers have been reloaded" correlation_id=e0c6953c-cb1d-45fd-826f-0bf66a192700 server_type=command
time=2026-01-22T19:17:27.978Z level=INFO msg="NGINX reloaded" process_id=12 correlation_id=e0c6953c-cb1d-45fd-826f-0bf66a192700 server_type=command

- NGINX limit_req_module: for more information on the underlying NGINX directives.
- Custom policies: learn about how NGINX Gateway Fabric custom policies work.
- API reference: all configuration fields for the RateLimitPolicy API.