Tier is admin-set; reputation is shown for transparency. Updated 6/8/2026.
Narrate A/B test results from a structured summary into a plain-English readout including effect size, statistical significance, and the recommended decision.
Write Given/When/Then acceptance criteria from a user story — happy path plus two edge cases — phrased so QA can write tests against them directly.
Write five Google Ads responsive search ad copy variants from a product description and audience — each fits headline length and includes a distinct CTA.
Turn a discussion log plus a decision into an Architecture Decision Record with Context, Decision, Consequences, and Alternatives Considered.
Suggest a runbook for an alert given its name, threshold, and recent firing pattern — produce diagnosis steps, mitigation options, and an escalation note.
Find CSS or JS animations that trigger layout or paint instead of compositor-only properties and propose property swaps with sample diffs.
Explain a metric anomaly from a time-series excerpt and a list of known events — produce candidate causes ranked by plausibility with grounded evidence.
Diff two OpenAPI YAML files and produce a backwards-compatibility changelog grouped into breaking, non-breaking, and additive changes.
Audit an AWS IAM policy against CloudTrail usage data and propose a minimized policy listing only actions actually invoked in the analysis window.
Read-only AWS surface — list/describe EC2, S3 buckets, IAM users, and Lambda functions. Auth via STS-assumed role; no mutating tools.
Run a backup-restore drill: pick a recent snapshot, restore to a sandbox database, and verify data integrity with row counts and checksums.
Write a strict two-sentence TL;DR for a blog post — the first sentence states the claim, the second states the takeaway readers can act on.
Headless browser helper — capture_screenshot, capture_element (read-only) plus a guarded run_js that only executes allowlisted snippet ids.
Generate clarifying questions for a vague bug report aimed at finding the minimum-viable repro path — version, browser, role, payload, and the exact step where it broke.
Read-only RubyGems helper — search_gems, get_gem_info, list_versions. Surface for Ruby dependency discovery from an agent.
Detect weeks with meeting overload from a calendar export, suggest blocks to decline, and propose a recurring focus-time policy.
Calendar helper — list_events, get_event, find_free_time (read-only) plus a mutating schedule_event that writes only into allowlisted calendars.
Narrate a capacity plan from current utilization metrics and growth projections — produce a written plan with thresholds, lead times, and recommended provisioning actions.
Read-only crates.io helper — search_crates, get_crate_info, list_versions. Surface for Rust dependency discovery from an agent.
Write a 600 to 800 word case study from a customer interview transcript structured as Problem, Solution, Impact, with named metrics and a single pull-quote.
Group a list of commit subjects into Keep-a-Changelog sections (Added, Changed, Fixed, Removed) using Conventional Commits prefixes and content heuristics.
Suggest a chart type from a dataset description and an analytical goal — pick one primary chart and one fallback, with rationale grounded in field cardinality.
Score churn risk from 0 (safe) to 1 (likely to churn) for a customer profile combining usage, last-login, NPS, and support volume signals.
Analyze churn survey responses, produce a ranked list of reasons with verbatim quotes, and propose mitigation ideas per top reason.
Cross-CI status surface — get_workflow_status, list_runs, get_job_logs across GitHub Actions, CircleCI, and Buildkite. Read-only.
Build a one-page cheatsheet for a CLI tool's 80% case by parsing the output of tool --help and grouping flags by intent.
Read-only Cloudflare surface — list zones, DNS records, deployed Workers, and page rules. Auth via scoped API token; no mutating tools.
Identify imports and module-init code that contribute to Cloudflare Worker cold starts and propose lazy-load rewrites.
Read-only AWS CloudWatch surface — query_logs (Logs Insights), get_metric_data, list_log_groups. Auth via STS-assumed role.
Critique a unified diff with concrete, line-anchored suggestions; flag bugs, performance regressions, and naming or style problems with severity tags.
Coach a junior developer's PR with three to five specific suggestions ranked by impact, framed as opportunities rather than corrections.
Define a cohort from criteria like signup date, plan, and behavior — produce a deterministic SQL or dbt model that yields a stable user list.
Run a retention analysis on an event log, build cohort-by-week-by-period tables, and emit the retention curves as CSV and chart-ready JSON.
Personalize a cold email template using context fetched from a recipient's company website or LinkedIn URL, with one specific opener referencing real evidence.
Rewrite a cold email for higher reply rate — references one specific detail about the recipient, makes one ask, and closes with a single low-friction next step.
Personalize cold outreach by referencing two or three concrete details about the recipient's company and connecting them to one specific product capability.
Rewrite a freeform commit message into the Conventional Commits format with a typed subject, a wrapped body, and a footer for issue refs and breaking changes.
Explain compensation band logic to a candidate given the band, their offered level, and the company's banding philosophy — produce a clear, defensible message.
Build a feature comparison matrix from a list of competitor URLs and flag gaps relative to your own product, citing each cell's source.
Write competitive positioning against a named competitor based on a feature-parity grid and one or two genuine differentiators — never disparage, always anchor in evidence.
Map a SOC2 or ISO 27001 control to evidence artifacts in a typical engineering org — produce a list of artifacts, owners, and the query or path that produces each.
Document every env var declared in .env.example and config files, cross-referenced to the code locations that read each one.
Scan a container image with Trivy or Grype and surface fixes ranked by exploitability and patch availability.
Generate a quarterly content calendar from themes and a target cadence — produce dated entries with format, theme, and one-line angle each.
Read a Lighthouse or CrUX report and narrate LCP, CLS, and INP scores into a prioritized fix plan with concrete code suggestions.
Audit a CORS configuration for over-permissive Origin, Methods, and Headers and propose a tightened policy keyed to actual cross-origin call patterns.
Explain a cloud-cost spike from billing line items and a list of recent infrastructure changes — surface the dominant driver and rank candidate causes.
Personalize a cover letter for a specific job by referencing one or two details from the JD and tying them to the candidate's most relevant accomplishment.
Read a list of crontab specifications and detect overlapping execution windows that risk resource contention or duplicate work.
Tighten a Content-Security-Policy by stripping wildcards and verifying the result against actual page resource loads observed in browser logs.