Rate Limiting
Tech Preview: This feature is currently in Tech Preview and is subject to change.
AuthZed Dedicated, AuthZed Cloud and SpiceDB Enterprise include a distributed rate limiting feature that allows you to control API request rates using flexible matching and bucketing rules. Rate limits are configured via YAML and can be applied globally, per-endpoint, per-service-account, or using custom CEL expressions.
This feature works alongside Restricted API Access to control how your services interact with AuthZed.
Overview
The rate limiting feature provides:
- Flexible Matching: Apply rate limits based on endpoints, service accounts, roles, headers, or custom CEL expressions
- Custom Bucketing: Group requests into rate limit buckets by service account, token, headers, or custom logic
- Distributed Coordination: Coordinate rate limits globally across multiple replicas
- Graceful Degradation: Automatically adjusts limits when coordination is unavailable
Configuration
The process for configuring rate limiting varies depending on the AuthZed product you’re using.
Dedicated & Cloud
Rate limits are configured using the same FGAM configuration file used for Restricted API Access.
Upload your FGAM configuration file (which can include both Restricted API Access and rate limiting rules) through the web dashboard in the Permission System’s “Access” tab.
Create a YAML file with your rate limit definitions:
rate_limits:
  # Global rate limit (applies to all requests)
  - id: "global-limit"
    displayName: "Global API Rate Limit"
    match:
      all: true
    limit:
      unit: "second"
      requests_per_unit: 1000
  # Per-endpoint rate limit
  - id: "check-permission-limit"
    displayName: "CheckPermission Rate Limit"
    match:
      endpoint: ["CheckPermission"]
    limit:
      unit: "second"
      requests_per_unit: 500
  # Multiple endpoints
  - id: "read-endpoints-limit"
    displayName: "Read Endpoints Rate Limit"
    match:
      endpoint:
        - "CheckPermission"
        - "ReadRelationships"
    limit:
      unit: "second"
      requests_per_unit: 1000
  # Per-service-account with bucketing
  - id: "sa-limit"
    displayName: "Service Account Limit"
    match:
      service_account: ["high-volume-client"]
    bucket_by:
      service_account: true
    limit:
      unit: "minute"
      requests_per_unit: 10000
  # Using headers for tenant-based rate limiting
  - id: "tenant-limit"
    displayName: "Per-Tenant Rate Limit"
    match:
      endpoint:
        - "CheckPermission"
        - "ReadRelationships"
    bucket_by:
      request: 'headers["x-tenant-id"]'
    limit:
      unit: "second"
      requests_per_unit: 100
For Dedicated & Cloud, the rate limiting configuration is applied through the FGAM file upload. There is no separate UI or API for rate limiting configuration at this time.
Rate Limit Configuration Reference
Matching Criteria
Every rate limit must specify at least one match criterion. All fields within a match use AND logic (all conditions must be true).
Available Match Fields
- all: Matches all requests (must be the only field in match)
- endpoint: Array of API method names (OR logic within array)
- service_account: Array of FGAM service account IDs (OR logic within array)
- role: Array of FGAM role names (OR logic within array)
- header: Array of header match objects (OR logic within array)
- request: CEL expression for complex matching logic
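The way AND logic across fields combines with OR logic inside each array can be sketched in Go. This is only an illustration: the `Request` and `Match` types and the `Matches` method are hypothetical names, not SpiceDB's implementation. The sketch covers just the `all`, `endpoint`, `service_account`, and `role` fields, and it assumes that an omitted array places no constraint on that field.

```go
package main

import "fmt"

// Request is a hypothetical, simplified view of an incoming API call.
type Request struct {
	Endpoint       string
	ServiceAccount string
	Role           string
}

// Match mirrors a subset of the documented match fields (illustrative only).
type Match struct {
	All             bool
	Endpoints       []string
	ServiceAccounts []string
	Roles           []string
}

// anyOf reports whether value equals one of the candidates (OR within an array).
// ASSUMPTION: an empty candidate list means the field was not specified.
func anyOf(candidates []string, value string) bool {
	if len(candidates) == 0 {
		return true
	}
	for _, c := range candidates {
		if c == value {
			return true
		}
	}
	return false
}

// Matches combines every specified field with AND logic.
func (m Match) Matches(r Request) bool {
	if m.All {
		return true
	}
	return anyOf(m.Endpoints, r.Endpoint) &&
		anyOf(m.ServiceAccounts, r.ServiceAccount) &&
		anyOf(m.Roles, r.Role)
}

func main() {
	adminReads := Match{Endpoints: []string{"ReadRelationships"}, Roles: []string{"admin"}}
	fmt.Println(adminReads.Matches(Request{Endpoint: "ReadRelationships", Role: "admin"}))  // true
	fmt.Println(adminReads.Matches(Request{Endpoint: "ReadRelationships", Role: "viewer"})) // false
	fmt.Println(adminReads.Matches(Request{Endpoint: "CheckPermission", Role: "admin"}))    // false
}
```

A request has to satisfy every field that is present, but only one entry of each array.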
Match Examples
rate_limits:
  # Global rate limit
  - id: "global"
    match:
      all: true
    limit:
      unit: "second"
      requests_per_unit: 1000
  # Single endpoint
  - id: "single-endpoint"
    match:
      endpoint: ["CheckPermission"]
    limit:
      unit: "second"
      requests_per_unit: 100
  # Multiple endpoints (OR logic)
  - id: "multiple-endpoints"
    match:
      endpoint:
        - "CheckPermission"
        - "ReadRelationships"
        - "LookupResources"
    limit:
      unit: "second"
      requests_per_unit: 200
  # Endpoint AND role (both must match)
  - id: "admin-reads"
    match:
      endpoint: ["ReadRelationships"]
      role: ["admin"]
    limit:
      unit: "minute"
      requests_per_unit: 5000
  # Header matching (single header)
  - id: "premium-tier"
    match:
      header:
        - name: "x-tier"
          value: "premium"
    limit:
      unit: "second"
      requests_per_unit: 500
  # Multiple headers (OR logic)
  - id: "high-tier"
    match:
      header:
        - name: "x-tier"
          value: "premium"
        - name: "x-tier"
          value: "enterprise"
    limit:
      unit: "second"
      requests_per_unit: 1000
CEL Expressions
Use CEL expressions for advanced matching and bucketing logic. CEL expressions have access to:
- endpoint: The API endpoint string
- serviceAccount: The service account ID
- headers or meta: gRPC metadata headers as map[string]string
- Request fields: Access request proto fields (e.g., CheckPermissionRequest.resource.object_type)
CEL Match Examples
rate_limits:
  # Pattern matching on service account
  - id: "batch-services"
    match:
      request: 'serviceAccount.startsWith("batch-")'
    limit:
      unit: "minute"
      requests_per_unit: 50000
  # Complex cross-field logic
  - id: "premium-endpoints"
    match:
      request: |
        (endpoint in ["CheckPermission", "ReadRelationships"]) &&
        (headers.get("x-tier", "") in ["premium", "enterprise"])
    limit:
      unit: "second"
      requests_per_unit: 2000
  # Request content filtering
  - id: "document-checks"
    displayName: "Per-Document Check Limit"
    match:
      endpoint: ["CheckPermission"]
      request: 'CheckPermissionRequest.resource.object_type == "document"'
    limit:
      unit: "second"
      requests_per_unit: 10
  # Conditional based on request size
  - id: "bulk-writes"
    match:
      endpoint: ["WriteRelationships"]
      request: "size(WriteRelationshipsRequest.updates) > 100"
    limit:
      unit: "minute"
      requests_per_unit: 100
Bucketing
Bucketing determines how requests are grouped into separate rate limit counters.
Bucketing Options
- service_account: true: Separate bucket per service account
- token: true: Separate bucket per API token
- header: "<header-name>": Separate bucket per header value
- request: "<CEL-expression>": Custom bucketing logic via CEL
Bucketing Examples
rate_limits:
  # Per-service-account bucketing
  - id: "per-sa"
    match:
      all: true
    bucket_by:
      service_account: true
    limit:
      unit: "second"
      requests_per_unit: 100
  # Per-tenant bucketing using header
  - id: "per-tenant"
    match:
      endpoint: ["CheckPermission"]
    bucket_by:
      request: 'headers["x-tenant-id"]'
    limit:
      unit: "second"
      requests_per_unit: 50
  # Bucket by request field
  - id: "per-document"
    match:
      endpoint: ["CheckPermission"]
      request: 'CheckPermissionRequest.resource.object_type == "document"'
    bucket_by:
      request: "CheckPermissionRequest.resource.object_id"
    limit:
      unit: "second"
      requests_per_unit: 10
  # Complex bucketing combining multiple values
  - id: "composite-bucket"
    match:
      endpoint:
        - "CheckPermission"
        - "ReadRelationships"
    bucket_by:
      request: |
        endpoint + "/" +
        headers.get("x-tenant-id", "default") + "/" +
        serviceAccount
    limit:
      unit: "minute"
      requests_per_unit: 1000
Rate Limit Units
The unit field supports:
- "second"
- "minute"
- "hour"
- "day"
You can also specify custom durations using Go duration syntax (e.g., "30s", "15m", "2h", "90s").
Self-Hosted Configuration
The following sections apply only to self-hosted SpiceDB Enterprise deployments.
Basic Setup
For self-hosted SpiceDB Enterprise deployments, use the following command-line flag:
| Flag | Description | Default |
|---|---|---|
| --rate-limit-config | Path to YAML file containing rate limit definitions | |
spicedb serve \
  --rate-limit-config=/path/to/config.yaml \
  ...
The YAML file follows the same format as shown in the configuration examples above.
Distributed Rate Limiting
Distributed rate limiting with gossip coordination is only configurable for self-hosted SpiceDB Enterprise deployments. AuthZed Dedicated handles this automatically.
For self-hosted deployments, you can enable distributed coordination across replicas using gossip for accurate global rate limits.
Enabling Gossip
spicedb serve \
  --rate-limit-config=/path/to/config.yaml \
  --rate-limit-gossip-enabled=true \
  --rate-limit-gossip-listen-addr=:6000 \
  --rate-limit-gossip-target-service=spicedb \
  --rate-limit-gossip-port-name=gossip \
  --rate-limit-gossip-replicas=3 \
  --rate-limit-gossip-use-dispatch-tls=true \
  ...
Gossip Configuration Flags
| Flag | Default | Description |
|---|---|---|
| --rate-limit-gossip-enabled | false | Enable distributed rate limiting via gossip |
| --rate-limit-gossip-listen-addr | :6000 | Address for gossip connections |
| --rate-limit-gossip-target-service | spicedb | Kubernetes service name for peer discovery |
| --rate-limit-gossip-port-name | "" | Port name to use for peer addresses |
| --rate-limit-gossip-replicas | 1 | Number of replicas for rate division |
| --rate-limit-gossip-use-dispatch-tls | false | Use dispatch TLS certificates for gossip |
| --rate-limit-gossip-tls-cert | "" | TLS certificate for gossip |
| --rate-limit-gossip-tls-key | "" | TLS key for gossip |
| --rate-limit-gossip-tls-ca | "" | TLS CA for mutual TLS |
| --rate-limit-gossip-tls-server-name | "" | Server name for TLS verification |
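The table describes --rate-limit-gossip-replicas as the number of replicas used for rate division. One plausible reading, stated here as an assumption rather than documented behavior, is that a replica that cannot coordinate with its peers enforces only its own share of each configured limit. The Go sketch below works through that arithmetic; `safeModeShare` is a hypothetical function, and the even split and floor of 1 are illustrative choices.

```go
package main

import "fmt"

// safeModeShare returns the portion of a limit that a single replica would
// enforce if the configured limit were split evenly across replicas.
// ASSUMPTION: even split with a floor of 1; the real behavior may differ.
func safeModeShare(requestsPerUnit, replicas int) int {
	if replicas < 1 {
		replicas = 1
	}
	share := requestsPerUnit / replicas
	if share < 1 {
		share = 1
	}
	return share
}

func main() {
	fmt.Println(safeModeShare(1000, 3)) // 333
	fmt.Println(safeModeShare(1000, 1)) // 1000
}
```

Under this reading, a replica count that does not match the actual deployment skews every per-replica share, which is why the flag should track the real number of replicas.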
Monitoring
For self-hosted SpiceDB Enterprise deployments, rate limiting exposes Prometheus metrics for monitoring:
| Metric | Type | Description |
|---|---|---|
| spicedb_ratelimit_check_latency_seconds | Histogram | Rate limit check latency |
| spicedb_ratelimit_gossip_messages_sent_total | Counter | Gossip messages sent |
| spicedb_ratelimit_gossip_messages_dropped_total | Counter | Messages dropped (buffer full) |
| spicedb_ratelimit_gossip_peers_active | Gauge | Active peer connections |
| spicedb_ratelimit_gossip_connection_errors_total | Counter | Connection failures |
Monitor the spicedb_ratelimit_gossip_peers_active metric to ensure gossip coordination is healthy.
Error Responses
When a rate limit is exceeded, the API returns:
- gRPC Status Code: RESOURCE_EXHAUSTED
- Response Trailers:
  - x-ratelimit-id: The rate limit ID that was exceeded
  - x-ratelimit-key: The bucket key
  - retry-after: Seconds until the client can retry
Example error handling in Go:
import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/metadata"
	"google.golang.org/grpc/status"
)

// Capture the response trailers with the grpc.Trailer call option.
var trailer metadata.MD
resp, err := client.CheckPermission(ctx, req, grpc.Trailer(&trailer))
if err != nil {
	if st, ok := status.FromError(err); ok && st.Code() == codes.ResourceExhausted {
		// Rate limit exceeded. metadata.MD.Get returns a []string.
		rateLimitID := trailer.Get("x-ratelimit-id")
		retryAfter := trailer.Get("retry-after")
		// Implement backoff logic
		log.Printf("Rate limit %v exceeded, retry after %v seconds",
			rateLimitID, retryAfter)
	}
}
Troubleshooting
Rate Limits Not Applied
- Verify the configuration file is being loaded with --rate-limit-config
- Check logs for configuration parsing errors
- Ensure match criteria are correctly specified (arrays for endpoints, service accounts, etc.)
Gossip Connectivity Issues
- Verify the gossip port (default :6000) is accessible between pods
- Check TLS configuration if using encrypted gossip
- Monitor spicedb_ratelimit_gossip_peers_active; it should equal replicas - 1
- Review spicedb_ratelimit_gossip_connection_errors_total for connectivity problems
Rate Limits Too Restrictive in Safe Mode
- Increase --rate-limit-gossip-replicas if it doesn’t match the actual deployment
- Fix gossip connectivity to enable coordinated mode
- Consider adjusting base rate limits to account for safe mode operation
CEL Expression Errors
- Test CEL expressions with representative requests
- Use .get("key", "default") for optional headers
- Check logs for CEL evaluation errors