Rate limiting
Production-grade rate limiting so your API doesn't get abused.
Protect your endpoints
Without rate limiting, one script can hammer your login endpoint a thousand times per second. Your database melts. Your Stripe integration gets flooded with checkout attempts. Your AI features burn through API credits in minutes.
Inside Claude Code, type /rate-limit. It sets up production-grade rate limiting across your application. Auth endpoints, API procedures, and any resource-intensive routes get protection based on their specific needs.
What it does
Different endpoints need different limits. A login endpoint should allow a few attempts per minute per IP. A search endpoint might allow more, but not unlimited. An AI generation endpoint that costs you money per call needs tight limits with clear feedback to the user.
The system configures limits based on what each endpoint does. It adds the rate limiting middleware, configures the storage backend, and sets up proper error responses. When a user hits a limit, they get a clear message with a retry-after header, not a cryptic 500 error.
How it's implemented
Rate limiting runs as middleware, so it catches abuse before your procedures even execute. The limits are configurable per route, and the storage can be swapped between in-memory for development and a persistent store for production.
The implementation handles the edge cases that trip people up: distributed rate limiting across multiple server instances, proper handling of proxy headers for accurate IP detection, and different limit tiers for authenticated versus anonymous requests.
/rate-limit