Model the impact of an API rate limit on a workload by computing expected throughput, queue depth, and tail latency under M/M/1 assumptions.
Given an API's rate limit and a workload's offered load, this skill computes expected throughput, mean queue depth, and 95th-percentile wait time using M/M/1 (or M/M/c) approximations. The output is a one-page markdown brief with numbers and a recommendation.
rate_limit_rps: the steady-state allowed request rate (requests per second).
burst_capacity: token-bucket burst size (defaults to rate_limit_rps).
offered_load_rps: the workload's mean request rate (requests per second).
concurrency: number of parallel callers (defaults to 1, i.e. M/M/1).
mean_service_time_ms: per-request server time; defaults to 1000 / rate_limit_rps.

Compute utilization: rho = offered_load_rps / (concurrency * rate_limit_rps).
If rho >= 1, the queue grows without bound: emit a "saturation" brief that reports time-to-fill the burst bucket = burst_capacity / (offered_load_rps - rate_limit_rps).
Mean queue depth: L_q = rho^2 / (1 - rho) for M/M/1.
Mean wait, by Little's law: W_q = L_q / offered_load_rps.
95th-percentile wait: W_q95 = W_q * 3 (rule of thumb: the 95th percentile of an exponential distribution is ln 20, about 3, times its mean).
Throughput: effective_throughput_rps = min(offered_load_rps, concurrency * rate_limit_rps).
With a finite buffer (size burst_capacity): P_drop = (1 - rho) * rho^N / (1 - rho^(N+1)), where N is the buffer size.
If rho > 0.8, suggest scaling concurrency or negotiating the limit; if 0.5 < rho <= 0.8, suggest jittered backoff.
Write the brief to rate-limit-model.md.

The output is rate-limit-model.md with: an inputs block, a computed-results table (rho, L_q, W_q, p95, throughput, drop probability), and a recommendation paragraph. Stdout prints rho and effective throughput for quick consumption.
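The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function name model_rate_limit, the dict keys, and the "no action needed" text for rho <= 0.5 are assumptions made here (the source gives no recommendation below 0.5). The sketch applies the M/M/1 queue formula to the aggregate rho even when concurrency > 1, and it reports an infinite time-to-fill when offered_load_rps equals rate_limit_rps, where the stated formula would divide by zero.

```python
import math


def model_rate_limit(rate_limit_rps, offered_load_rps, concurrency=1,
                     burst_capacity=None):
    """Hypothetical helper: computed results for the brief (times in seconds)."""
    if burst_capacity is None:
        burst_capacity = rate_limit_rps  # documented default
    capacity_rps = concurrency * rate_limit_rps
    rho = offered_load_rps / capacity_rps
    result = {
        "rho": rho,
        "saturated": rho >= 1,
        "effective_throughput_rps": min(offered_load_rps, capacity_rps),
    }

    if offered_load_rps == 0:
        result["recommendation"] = "no impact, queue empty"
        return result

    if rho >= 1:
        # Saturation: the steady-state formulas do not apply.
        excess_rps = offered_load_rps - rate_limit_rps
        # Assumption: report infinity when there is no excess load.
        result["time_to_fill_s"] = (
            burst_capacity / excess_rps if excess_rps > 0 else math.inf
        )
        result["recommendation"] = "saturation"
        return result

    l_q = rho ** 2 / (1 - rho)        # mean queue depth, M/M/1
    w_q_s = l_q / offered_load_rps    # mean wait via Little's law
    n = burst_capacity                # finite buffer size N
    result.update(
        l_q=l_q,
        w_q_s=w_q_s,
        w_q95_s=3 * w_q_s,            # rule-of-thumb p95
        p_drop=(1 - rho) * rho ** n / (1 - rho ** (n + 1)),
    )

    if rho > 0.8:
        result["recommendation"] = "scale concurrency or negotiate the limit"
    elif rho > 0.5:
        result["recommendation"] = "use jittered backoff"
    else:
        # Assumption: the source states no recommendation for rho <= 0.5.
        result["recommendation"] = "no action needed"
    return result


if __name__ == "__main__":
    print(model_rate_limit(rate_limit_rps=10, offered_load_rps=8))
```

With rate_limit_rps = 10 and offered_load_rps = 8, this gives rho = 0.8, L_q of about 3.2 requests, W_q of about 0.4 s, and a rule-of-thumb p95 wait of about 1.2 s, which lands in the jittered-backoff band.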
Plug the inputs into a Monte Carlo simulator (a 10-line Python script using random.expovariate) for 100k synthetic requests, and verify that the simulated mean wait time is within 20% of the computed W_q. If the divergence exceeds 20%, the workload is likely non-Poisson: surface a caveat noting that the M/M/1 assumptions do not hold. Also sanity-check that effective_throughput_rps <= concurrency * rate_limit_rps.
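A minimal sketch of such a simulator follows, assuming a single FIFO server with exponentially distributed service times at rate rate_limit_rps (which matches the default mean_service_time_ms). The function names simulate_mean_wait_s and diverges, and the fixed seed, are illustrative choices rather than part of the skill.

```python
import random


def simulate_mean_wait_s(offered_load_rps, rate_limit_rps,
                         n_requests=100_000, seed=0):
    """Simulate an M/M/1 FIFO queue and return the mean wait in queue (seconds)."""
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    arrival = 0.0              # arrival time of the current request
    server_free_at = 0.0       # time the server finishes the previous request
    total_wait = 0.0
    for _ in range(n_requests):
        arrival += rng.expovariate(offered_load_rps)      # Poisson arrivals
        start = max(arrival, server_free_at)              # wait if server is busy
        total_wait += start - arrival
        server_free_at = start + rng.expovariate(rate_limit_rps)  # service time
    return total_wait / n_requests


def diverges(simulated, computed, tolerance=0.20):
    """True when the simulated mean wait is outside the tolerance band."""
    return abs(simulated - computed) > tolerance * computed


if __name__ == "__main__":
    lam, mu = 5.0, 10.0
    rho = lam / mu
    w_q = (rho ** 2 / (1 - rho)) / lam  # computed W_q from the formulas
    sim = simulate_mean_wait_s(lam, mu)
    print(f"computed W_q={w_q:.4f}s simulated={sim:.4f}s diverges={diverges(sim, w_q)}")
```

Because this simulator draws Poisson arrivals itself, it should agree with the formula. To detect a non-Poisson workload, the arrival gaps would have to come from the real trace instead of rng.expovariate.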
offered_load_rps == 0: emit a trivial brief stating "no impact, queue empty".
c = 1 server with capacity burst_capacity; document the simplification.