L3 · Backend · infra · 18-28h total

Rate-limited API gateway with Redis + sliding window

Rate limiting is one of the most commonly asked-about pieces of backend infra in mid-level interviews. Building one teaches concurrency safety, Redis Lua atomicity, sliding-window math, and observability under load, the same concerns that production gateways at companies like Stripe, Cloudflare, and AWS have to handle.

Resume bullet (when finished)

Built an async FastAPI gateway with Redis-backed sliding-window rate limiting, per-key quotas, and request tracing; load-tested with locust at 1.4k RPS sustained on a single 1-CPU container with p95 latency under 14 ms.

Locked tech stack

No "choose your language" — analysis paralysis kills completion. Follow the stack to the letter on your first build.

Python 3.12 · FastAPI · httpx · Redis · Lua scripts · locust · Docker

Milestones (6 · ~22h)

  1. M1 · ~3h

    FastAPI proxy skeleton

    `/proxy/{path:path}` forwards to a configured upstream via httpx; preserves method, body, headers, status.

    CHECK BEFORE MOVING ON:

    • Why httpx over requests for a proxy?
    • What header has to be rewritten and why?
    $ git commit -m "feat(proxy): FastAPI + httpx upstream forwarder"
  2. M2 · ~3h

    Fixed-window rate limit

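    Naive fixed-window limit per API key: `INCR key`, then `EXPIRE key 60` when the counter is first created (`INCR` itself takes no `EX` option). Returns 429 with `Retry-After` once the count exceeds the limit.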

    CHECK BEFORE MOVING ON:

    • What's the bias in fixed-window?
    • Why does Retry-After matter for API consumers?
    $ git commit -m "feat(rl): fixed-window counter"
  3. M3 · ~5h

    Sliding-window via Lua

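    Atomic Lua script implements a sliding-window log: ZREMRANGEBYSCORE evicts timestamps older than the window, ZCARD counts what remains, and ZADD records the request only if the count is under the limit. The whole script runs as one step, so concurrent requests cannot race.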

    CHECK BEFORE MOVING ON:

    • Why Lua and not a transaction (MULTI/EXEC)?
    • What's the memory cost difference vs fixed window?
    $ git commit -m "feat(rl): atomic sliding window via Lua"
  4. M4 · ~4h

    Per-key quotas + tier config

    API keys map to tiers (free=60/min, paid=600/min). Tier config hot-reloads from a YAML file.

    CHECK BEFORE MOVING ON:

    • Where does the tier lookup happen — every request or cached?
    • How do you handle a tier change mid-flight?
    $ git commit -m "feat(rl): tiered quotas + hot config reload"
  5. M5 · ~4h

    Observability + load test

    Prometheus `/metrics` exposes request rate, 429 rate, and upstream latency. locust drives 1.4k RPS against the dev container.

    CHECK BEFORE MOVING ON:

    • What's the difference between RED and USE metrics?
    • Why locust over wrk for this scenario?
    $ git commit -m "ops: Prometheus metrics + locust load test"
  6. M6 · ~3h

    Docker + deploy doc

    Multi-stage Dockerfile. README's deploy section covers Fly.io and a single-node Kubernetes cluster.

    CHECK BEFORE MOVING ON:

    • Why multi-stage and what's in each stage?
    • Where would this break under multi-pod replicas?
    $ git commit -m "ops: multi-stage Docker + deploy guide"
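
The fixed-window bias that milestone 2 asks about can be made concrete with a small in-memory model. This is only a sketch: `FixedWindowLimiter` and `burst_across_boundary` are hypothetical names, and a plain dict stands in for the Redis counter, so nothing here talks to Redis.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """In-memory stand-in for the INCR + EXPIRE counter (assumption: no Redis)."""

    def __init__(self, limit: int, window_s: int) -> None:
        self.limit = limit
        self.window_s = window_s
        # One counter per (api_key, window index); Redis would expire old ones.
        self.counts: dict[tuple[str, int], int] = defaultdict(int)

    def allow(self, api_key: str, now_s: float) -> bool:
        # Every request in the same window shares one bucket.
        bucket = int(now_s // self.window_s)
        self.counts[(api_key, bucket)] += 1  # mirrors INCR
        return self.counts[(api_key, bucket)] <= self.limit


def burst_across_boundary(limit: int = 60, window_s: int = 60) -> int:
    """Count requests accepted in a short span that straddles a window edge."""
    limiter = FixedWindowLimiter(limit, window_s)
    accepted = 0
    # `limit` requests just before the first window ends ...
    for _ in range(limit):
        accepted += limiter.allow("demo-key", now_s=window_s - 1.0)
    # ... and `limit` more just after the next window starts.
    for _ in range(limit):
        accepted += limiter.allow("demo-key", now_s=window_s + 0.5)
    return accepted
```

With the defaults, `burst_across_boundary()` returns 120: twice the nominal 60/min limit is admitted within about 1.5 seconds, because each window's counter starts from zero.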

60-second demo storyboard

What you say in the recruiter screen when they ask "tell me about your latest project." Practice it out loud.

  1. 0-5s: 'A FastAPI rate-limited gateway with sliding-window in Redis Lua.'
  2. 5-20s: live demo — `locust -u 200`, watch Prometheus rate metric.
  3. 20-40s: show the Lua script + 'why this is atomic and a Python loop wouldn't be'.
  4. 40-60s: 1.4k RPS, p95 < 14ms on a 1-CPU container.

STAR talking points for behavioral round

STAR — CONCURRENCY

Situation: Python-side rate-limit check + Redis update had a race window where two requests both passed the limit. Task: make it atomic. Action: ported the check-and-increment to a Lua script — Redis runs Lua atomically. Result: at 1.4k RPS the counter is never over by even 1 request.
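
The check-and-record step described above can be modelled in plain Python to see what the Lua script has to do in one indivisible step. This is a single-threaded reference sketch, not the project's script: `SlidingWindowLog` is a hypothetical name, a sorted list stands in for the Redis sorted set, and the comments name the Redis commands each line corresponds to.

```python
import bisect


class SlidingWindowLog:
    """Reference model of the sliding-window check (assumption: no Redis).

    `allow` runs start to finish without interleaving here; the Lua script
    gives the same guarantee inside Redis.
    """

    def __init__(self, limit: int, window_ms: int) -> None:
        self.limit = limit
        self.window_ms = window_ms
        # api_key -> sorted request timestamps in milliseconds.
        self.log: dict[str, list[int]] = {}

    def allow(self, api_key: str, now_ms: int) -> bool:
        stamps = self.log.setdefault(api_key, [])
        # ZREMRANGEBYSCORE key -inf (now - window): evict expired entries.
        cutoff = now_ms - self.window_ms
        del stamps[: bisect.bisect_right(stamps, cutoff)]
        # ZCARD key: count requests still inside the window.
        if len(stamps) >= self.limit:
            return False
        # ZADD key now member: record the request only once it is admitted.
        # (A real sorted set needs a unique member per request; a list does not.)
        bisect.insort(stamps, now_ms)
        return True
```

If the evict, count, and record steps ran as three separate round trips from Python, two requests could both read a count of `limit - 1` and both be admitted, which is the race the STAR story describes.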

STAR — OBSERVABILITY

Situation: 429s spiked but I didn't know if it was real traffic or a runaway client. Task: add the right signals. Action: added per-key + per-tier metrics, a P95 latency histogram, and an upstream error rate. Result: I could answer 'who is causing this' in under 10s by reading one Grafana row.
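
The "who is causing this" question boils down to grouping 429s by API key. A minimal sketch of that aggregation, assuming a hypothetical list of `(api_key, status_code)` pairs rather than real Prometheus series:

```python
from collections import Counter


def top_throttled(events: list[tuple[str, int]], n: int = 3) -> list[tuple[str, int]]:
    """Return the n API keys with the most 429 responses.

    `events` is a hypothetical list of (api_key, status_code) pairs.
    """
    rejected = Counter(key for key, status in events if status == 429)
    return rejected.most_common(n)
```

In the gateway itself the same grouping would come from a counter labelled by key and tier; this function only illustrates the query being asked of the dashboard.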

Production references — how grown-up systems do this

Stripe

Stripe's engineering writeup on rate limiters is a widely cited reference; it describes token-bucket limiting backed by Redis.

Redis

Redis docs on scripting + atomicity are the source of truth for why Lua is the right call.

Cloudflare

Cloudflare's rate-limiting math (the 'sliding-window approximation') is a clean read on why exact sliding window is expensive at scale.
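
The sliding-window approximation is commonly described as a weighted sum of two fixed-window counters: the previous window's count, scaled by how much of it still overlaps the trailing window, plus the current window's count. A sketch under that assumption (`estimated_count` and its parameter names are illustrative, not taken from any Cloudflare code):

```python
def estimated_count(
    prev_count: int, curr_count: int, elapsed_s: float, window_s: float
) -> float:
    """Estimate requests in the trailing window from two fixed-window counters.

    elapsed_s is how far we are into the current fixed window.
    """
    if not 0 <= elapsed_s < window_s:
        raise ValueError("elapsed_s must lie within the current window")
    # Fraction of the previous window that the trailing window still covers.
    overlap = (window_s - elapsed_s) / window_s
    return prev_count * overlap + curr_count
```

For example, 15 seconds into a 60-second window with 40 requests in the previous window and 10 so far in this one, the estimate is 40 * 0.75 + 10 = 40.0. The cost is two integers per key instead of one stored timestamp per request, at the price of assuming the previous window's traffic was evenly spread.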

Self-review rubric (before you claim done)

Correctness

  • Sliding window is exact (no leakage at boundary).
  • 429 includes Retry-After.
  • Tier hot-reload works without a restart.
  • Atomic under concurrent traffic — verified by load test.
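
For the `Retry-After` item, one reasonable value under a sliding-window log is the time until the oldest logged request leaves the window, rounded up to whole seconds. A sketch, assuming millisecond timestamps and a hypothetical helper name `retry_after_seconds`:

```python
import math


def retry_after_seconds(stamps_ms: list[int], now_ms: int, window_ms: int) -> int:
    """Seconds until the oldest logged request leaves the window, rounded up."""
    if not stamps_ms:
        return 0  # nothing logged, so nothing to wait for
    oldest = min(stamps_ms)
    # The oldest entry is evicted once now >= oldest + window.
    remaining_ms = oldest + window_ms - now_ms
    return max(0, math.ceil(remaining_ms / 1000))
```

Rounding up matters here: telling a client to retry in 0 seconds while the window is still full just produces another 429.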

Code quality

  • Lua script is its own file, not an inline string.
  • Async throughout — no sync I/O in the proxy path.
  • Tier config schema validated on load.
  • No global mutable state outside Redis.
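
"Tier config schema validated on load" could look like the sketch below. It assumes the YAML file has already been parsed into a plain mapping of tier name to requests per minute; that shape, the `Tier` dataclass, and `parse_tiers` are all illustrative choices, not a schema the project prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    requests_per_minute: int


def parse_tiers(raw: object) -> dict[str, Tier]:
    """Validate a parsed config mapping such as {"free": 60, "paid": 600}."""
    if not isinstance(raw, dict) or not raw:
        raise ValueError("tier config must be a non-empty mapping")
    tiers: dict[str, Tier] = {}
    for name, rpm in raw.items():
        if not isinstance(name, str) or not name:
            raise ValueError(f"tier name must be a non-empty string, got {name!r}")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(rpm, bool) or not isinstance(rpm, int) or rpm <= 0:
            raise ValueError(
                f"tier {name!r}: limit must be a positive integer, got {rpm!r}"
            )
        tiers[name] = Tier(name, rpm)
    return tiers
```

Because `parse_tiers` either returns a complete new mapping or raises, a hot reload can keep serving the previous config whenever the edited file is invalid.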

Testing

  • Unit tests cover the Lua script with a fakeredis instance.
  • locust scenario file committed and CI-runnable.
  • Property test: 'limit never exceeded under random traffic'.
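
The property "limit never exceeded under random traffic" needs an oracle that is independent of the limiter being tested. One way to sketch it with only the standard library: feed seeded random traffic to any `allow(now_ms) -> bool` callable, then check the accepted timestamps with a two-pointer scan. Both function names and the `allow` signature are assumptions for illustration.

```python
import random
from typing import Callable


def max_in_any_window(accepted_ms: list[int], window_ms: int) -> int:
    """Largest number of accepted requests in any window (t - window_ms, t]."""
    stamps = sorted(accepted_ms)
    best = 0
    left = 0
    for right, t in enumerate(stamps):
        # Advance left past entries that are no longer within window_ms of t.
        while stamps[left] <= t - window_ms:
            left += 1
        best = max(best, right - left + 1)
    return best


def check_limit_never_exceeded(
    allow: Callable[[int], bool],
    limit: int,
    window_ms: int,
    seed: int,
    n_requests: int = 2000,
) -> bool:
    """Drive `allow` with seeded random arrivals and verify the property."""
    rng = random.Random(seed)
    now_ms = 0
    accepted: list[int] = []
    for _ in range(n_requests):
        # Gaps are short enough that traffic regularly exceeds the limit.
        now_ms += rng.randint(0, window_ms // limit)
        if allow(now_ms):
            accepted.append(now_ms)
    return max_in_any_window(accepted, window_ms) <= limit
```

A property-testing library could replace the hand-rolled generator; the oracle stays the same. Running the check against a deliberately broken limiter (one that always returns `True`) is a quick way to confirm the test can actually fail.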

Docs

  • README has the Stripe / Cloudflare references inline.
  • Single-page architecture diagram.
  • 'How would you scale this to multi-region?' section.

✱ AI code review

Get a senior-style review before you call it done

Push your finished work to GitHub, open a PR, paste the PR URL below. Claude reviews the diff against this project's rubric and replies with strengths, must-fix items, and one teachable principle.

Tick the rubric items honestly, write the README, push to GitHub, get the AI review above. Once it's clean, email support@learnpython.academy with the repo link — we feature the best ones on /success-stories.
