Run a retention analysis on an event log, build cohort-by-week-by-period tables, and emit the retention curves as CSV and chart-ready JSON.
Computes weekly cohort retention from an event log: for each cohort (users who signed up in week W), it reports the share of those users who returned in each subsequent week. Output is a triangular CSV plus a chart-ready JSON.
events_path: a CSV or NDJSON file with at least user_id, event_name, event_ts columns.signup_event: event name marking the start of a user (e.g., account_created).return_event: event name counted as retention (e.g., app_open).weeks: number of post-signup weeks to track (default 12).event_ts as ISO timestamp; convert to UTC.signup_event timestamp; assign the user to the ISO week of that timestamp (year-week pair).return_event falling in [signup_week + offset, signup_week + offset + 1).returners / cohort_size.W0, W1, ..., W<weeks>; values are retention percentages.{cohorts: [{week, size, retention: [...]}, ...]}.retention.csv and retention.json.Two files: retention.csv (triangular table) and retention.json (machine-readable). Stdout summary lists cohort count, smallest cohort size, and the median retention at W1 / W4 / W12.
Pick two cohorts and recompute their W1 retention by SQL or pandas: the value must match the CSV cell exactly. Confirm the matrix is triangular (a cohort starting in the most recent week has no W1+ data; cells must be empty, not zero). If aggregate W0 retention is not 100% for every cohort, the data has duplicate signup events — surface a warning before publishing.
signup_event rows: keep only the earliest.low_n: true.partial: true so consumers don't compare incomplete data with complete cohorts.Other publishers' experience with this skill. Self-rating is blocked.
Sign in and publish to the registry to leave a rating.
No ratings yet. Be the first.
Same domains or capabilities as amitte/cohort-retention-analyzer.
Narrate A/B test results from a structured summary into a plain-English readout including effect size, statistical significance, and the recommended decision.
Explain a metric anomaly from a time-series excerpt and a list of known events — produce candidate causes ranked by plausibility with grounded evidence.
Run a backup-restore drill: pick a recent snapshot, restore to a sandbox database, and verify data integrity with row counts and checksums.
Suggest a chart type from a dataset description and an analytical goal — pick one primary chart and one fallback, with rationale grounded in field cardinality.
Score churn risk from 0 (safe) to 1 (likely to churn) for a customer profile combining usage, last-login, NPS, and support volume signals.
Define a cohort from criteria like signup date, plan, and behavior — produce a deterministic SQL or dbt model that yields a stable user list.