Narrate A/B test results from a structured summary into a plain-English readout including effect size, statistical significance, and the recommended decision.
You are an experimentation analyst. You translate statistical test output into a recommendation a PM can act on without rerunning the calculator.
Narrate the A/B test summary into a readout with relative lift, significance verdict, decision, and a short narrative paragraph.
You receive:
- metric: name of the primary metric.
- control: { n, value }.
- treatment: { n, value }.
- p_value: two-sided p-value.
- ci_low, ci_high: 95% CI bounds for absolute lift.
- mde: pre-registered minimum detectable effect (configurable as absolute or relative; treat as absolute by default).
lift_relative = (treatment.value - control.value) / control.value. Round to 2 decimals.
significant = p_value < 0.05. Note that significance alone is not a decision.
decision is one of:
- ship: significant AND the lower CI bound exceeds the MDE.
- kill: significant AND lift is in the wrong direction.
- iterate: not significant AND the CI is wide (extends well above and below 0).
- extend: not significant AND the CI is tight near 0 with insufficient n for the MDE.
Return JSON { readout: { lift_relative, significant, decision, narrative } }.
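The rules above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation. The names `Arm`, `build_readout`, `small_n`, and `wide_ci_width` are hypothetical. The sketch assumes that a higher metric value is better, that mde is absolute, that n refers to the smaller arm, and that a CI is "wide" when it spans 0 and is broader than 2 * mde. The rules do not cover a significant positive lift whose lower CI bound does not clear the MDE, so the sketch leaves the decision as None in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Arm:
    n: int        # sample size of the arm
    value: float  # observed value of the primary metric


def build_readout(metric, control, treatment, p_value, ci_low, ci_high, mde,
                  small_n=1000, wide_ci_width=None):
    # Relative lift, rounded to 2 decimals.
    lift_relative = round((treatment.value - control.value) / control.value, 2)
    significant = p_value < 0.05

    # ASSUMPTION: "wide" means the CI spans 0 and is broader than 2 * mde.
    if wide_ci_width is None:
        wide_ci_width = 2 * mde

    decision: Optional[str] = None
    if significant:
        if ci_low > mde:
            decision = "ship"
        elif treatment.value < control.value:
            # ASSUMPTION: higher is better, so a negative lift is the wrong direction.
            decision = "kill"
        # Otherwise no stated rule applies; decision stays None.
    else:
        # ASSUMPTION: n is the smaller of the two arms.
        if min(control.n, treatment.n) < small_n:
            decision = "extend"
        elif ci_low < 0 < ci_high and (ci_high - ci_low) > wide_ci_width:
            decision = "iterate"
        else:
            decision = "extend"

    verdict = "significant" if significant else "not significant"
    narrative = (
        f"{metric}: relative lift of {lift_relative:+.0%} "
        f"(95% CI for absolute lift: {ci_low} to {ci_high}), {verdict} "
        f"at p = {p_value}. Recommended decision: {decision}."
    )
    return {"readout": {"lift_relative": lift_relative,
                        "significant": significant,
                        "decision": decision,
                        "narrative": narrative}}
```

For example, `build_readout("conversion", Arm(5000, 0.10), Arm(5000, 0.12), 0.01, 0.012, 0.028, 0.01)` yields a ship decision, because the result is significant and the lower CI bound (0.012) exceeds the MDE (0.01).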
Checks:
- lift_relative is computed correctly from control.value and treatment.value.
- significant === (p_value < 0.05).
- decision is consistent with significant, lift sign, and CI vs MDE.
- If n is small (< 1000) and the result is non-significant, the decision is extend, not kill.
Same domains or capabilities as amitte/ab-test-result-narrator.
Explain a metric anomaly from a time-series excerpt and a list of known events — produce candidate causes ranked by plausibility with grounded evidence.
Run a backup-restore drill: pick a recent snapshot, restore to a sandbox database, and verify data integrity with row counts and checksums.
Suggest a chart type from a dataset description and an analytical goal — pick one primary chart and one fallback, with rationale grounded in field cardinality.
Define a cohort from criteria like signup date, plan, and behavior — produce a deterministic SQL or dbt model that yields a stable user list.
Run a retention analysis on an event log, build cohort-by-week-by-period tables, and emit the retention curves as CSV and chart-ready JSON.
Answer a question about a CSV by reasoning over the column types and a sampled subset of rows, returning the answer plus the column-by-column logic.