8 MCP servers, 24 agents and 18 skills, shipped together with one signature.
Tools an agent host can call into.
Read-only AWS surface — list/describe EC2, S3 buckets, IAM users, and Lambda functions. Auth via STS-assumed role; no mutating tools.
Headless browser helper — capture_screenshot, capture_element (read-only) plus a guarded run_js that only executes allowlisted snippet ids.
Read-only RubyGems helper — search_gems, get_gem_info, list_versions. Surface for Ruby dependency discovery from an agent.
Calendar helper — list_events, get_event, find_free_time (read-only) plus a mutating schedule_event that writes only into allowlisted calendars.
Read-only crates.io helper — search_crates, get_crate_info, list_versions. Surface for Rust dependency discovery from an agent.
Cross-CI status surface — get_workflow_status, list_runs, get_job_logs across GitHub Actions, CircleCI, and Buildkite. Read-only.
Read-only Cloudflare surface — list zones, DNS records, deployed Workers, and page rules. Auth via scoped API token; no mutating tools.
Read-only AWS CloudWatch surface — query_logs (Logs Insights), get_metric_data, list_log_groups. Auth via STS-assumed role.
A2A endpoints or packaged prompts that take a goal.
Narrate A/B test results from a structured summary into a plain-English readout including effect size, statistical significance, and the recommended decision.
Markdown playbooks an agent reads.
Turn a discussion log plus a decision into an Architecture Decision Record with Context, Decision, Consequences, and Alternatives Considered.
Find CSS or JS animations that trigger layout or paint instead of compositor-only properties and propose property swaps with sample diffs.
Diff two OpenAPI YAML files and produce a backwards-compatibility changelog grouped into breaking, non-breaking, and additive changes.
Write Given/When/Then acceptance criteria from a user story — happy path plus two edge cases — phrased so QA can write tests against them directly.
Write five Google Ads responsive search ad copy variants from a product description and audience — each fits headline length and includes a distinct CTA.
Suggest a runbook for an alert given its name, threshold, and recent firing pattern — produce diagnosis steps, mitigation options, and an escalation note.
Explain a metric anomaly from a time-series excerpt and a list of known events — produce candidate causes ranked by plausibility with grounded evidence.
Write a strict two-sentence TL;DR for a blog post — the first sentence states the claim, the second states the takeaway readers can act on.
Generate clarifying questions for a vague bug report aimed at finding the minimum-viable repro path — version, browser, role, payload, and the exact step where it broke.
Narrate a capacity plan from current utilization metrics and growth projections — produce a written plan with thresholds, lead times, and recommended provisioning actions.
Write a 600 to 800 word case study from a customer interview transcript structured as Problem, Solution, Impact, with named metrics and a single pull-quote.
Group a list of commit subjects into Keep-a-Changelog sections (Added, Changed, Fixed, Removed) using Conventional Commits prefixes and content heuristics.
Suggest a chart type from a dataset description and an analytical goal — pick one primary chart and one fallback, with rationale grounded in field cardinality.
Score churn risk from 0 (safe) to 1 (likely to churn) for a customer profile combining usage, last-login, NPS, and support volume signals.
Critique a unified diff with concrete, line-anchored suggestions; flag bugs, performance regressions, and naming or style problems with severity tags.
Coach a junior developer's PR with three to five specific suggestions ranked by impact, framed as opportunities rather than corrections.
Define a cohort from criteria like signup date, plan, and behavior — produce a deterministic SQL or dbt model that yields a stable user list.
Rewrite a cold email for higher reply rate — references one specific detail about the recipient, makes one ask, and closes with a single low-friction next step.
Personalize cold outreach by referencing two or three concrete details about the recipient's company and connecting them to one specific product capability.
Rewrite a freeform commit message into the Conventional Commits format with a typed subject, a wrapped body, and a footer for issue refs and breaking changes.
Explain compensation band logic to a candidate given the band, their offered level, and the company's banding philosophy — produce a clear, defensible message.
Write competitive positioning against a named competitor based on a feature-parity grid and one or two genuine differentiators — never disparage, always anchor in evidence.
Map a SOC2 or ISO 27001 control to evidence artifacts in a typical engineering org — produce a list of artifacts, owners, and the query or path that produces each.
Generate a quarterly content calendar from themes and a target cadence — produce dated entries with format, theme, and one-line angle each.
Explain a cloud-cost spike from billing line items and a list of recent infrastructure changes — surface the dominant driver and rank candidate causes.
Personalize a cover letter for a specific job by referencing one or two details from the JD and tying them to the candidate's most relevant accomplishment.
Audit an AWS IAM policy against CloudTrail usage data and propose a minimized policy listing only actions actually invoked in the analysis window.
Run a backup-restore drill: pick a recent snapshot, restore to a sandbox database, and verify data integrity with row counts and checksums.
Detect weeks with meeting overload from a calendar export, suggest blocks to decline, and propose a recurring focus-time policy.
Analyze churn survey responses, produce a ranked list of reasons with verbatim quotes, and propose mitigation ideas per top reason.
Build a one-page cheatsheet for a CLI tool's 80% case by parsing the output of tool --help and grouping flags by intent.
Identify imports and module-init code that contribute to Cloudflare Worker cold starts and propose lazy-load rewrites.
Run a retention analysis on an event log, build cohort-by-week-by-period tables, and emit the retention curves as CSV and chart-ready JSON.
Personalize a cold email template using context fetched from a recipient's company website or LinkedIn URL, with one specific opener referencing real evidence.
Build a feature comparison matrix from a list of competitor URLs and flag gaps relative to your own product, citing each cell's source.
Document every env var declared in .env.example and config files, cross-referenced to the code locations that read each one.
Scan a container image with Trivy or Grype and surface fixes ranked by exploitability and patch availability.
Read a Lighthouse or CrUX report and narrate LCP, CLS, and INP scores into a prioritized fix plan with concrete code suggestions.
Audit a CORS configuration for over-permissive Origin, Methods, and Headers and propose a tightened policy keyed to actual cross-origin call patterns.
Read a list of crontab specifications and detect overlapping execution windows that risk resource contention or duplicate work.
Tighten a Content-Security-Policy by stripping wildcards and verifying the result against actual page resource loads observed in browser logs.