We're not a large company. Is ChurnAI built for us?

ChurnAI works for B2B SaaS teams managing 20 or more accounts. Most teams see their first risk scores within 48 hours. You do not need a data warehouse to start. Stripe and Salesforce alone are enough. CS teams at Series A through Series C use it, and the scoring model adapts to your data volume. If your team runs manual health checks in spreadsheets today, this replaces that workflow on day one.

What if we connect and see no results?

Within 48 hours, you see your real accounts with real scores. Each flag has a specific reason: usage trend, billing event, or support pattern. CS teams run their first save calls in week one. If ChurnAI connects to your data and produces no actionable signal, we extend onboarding at no cost until it does.

How is ChurnAI different from ChurnZero or Gainsight?

Every competitor sends your data to their cloud. ChurnAI runs in yours. Zero rows reach our servers. ChurnZero and Gainsight have no product team view, no cohort-to-revenue analysis, and go-live timelines measured in months. Seat pricing means your cost scales with headcount, not results. We built the tool we wished existed when we were doing security reviews against those vendors.

When will we see results?

Most teams see their first ranked risk list within 48 hours of connecting their data. CS teams run their first save calls in week one. A retention team we onboarded closed 3 at-risk renewals in 7 days. ChurnAI flagged those accounts 10 days before their CSMs noticed. The 48-hour onboarding is a hard commitment.

What's the real cost: setup fees, data migration, hidden pricing?

No setup fee. No platform migration. No seat-based pricing surprises. You pay a flat monthly fee based on your account volume. A ChurnAI engineer handles the deployment inside your cloud. You do not touch infrastructure. We built it self-hosted because every vendor we competed against failed the security conversation. That architecture closes deals other tools cannot.

health-scores · 6 min read · July 6, 2026

Why most customer health scores are decoration

A health score nobody validates against renewal outcomes is dashboard furniture. Here is the quarterly backtest that tells you if yours actually predicts anything.

THE SHORT ANSWER

Most customer health scores fail because nobody validates them against renewal outcomes. The score gets built once, turns red and green on a dashboard, and never changes a decision. The fix is a quarterly backtest: pull last quarter's churned accounts, check what their score said 60 and 90 days before cancellation, then do the same for renewals. If the scores do not separate the two groups, the score is decoration.

What makes a health score decoration instead of a tool

A health score is decoration when it exists but changes nothing. You can spot one in about two minutes with three questions:

1. When did the score last cause someone to make a call, send an email, or change a renewal forecast?
2. Has anyone ever checked whether accounts that churned actually had low scores before they left?
3. Can anyone on the team explain why a specific account is red right now?

If the answers are "not sure," "no," and "the formula does something with logins," you have decoration. The score was built in a spreadsheet or a CS tool during a planning cycle, everyone felt good about it for a month, and then it became wallpaper.

This is common because building a score feels like progress and validating one feels like homework. The building part gets a kickoff meeting. The validation part gets nothing, because no calendar invite says "check whether our score predicted the churn we just ate."

The cost is not neutral. A wrong score is worse than no score. It tells you the account that is about to cancel is healthy, so you skip the call that might have saved them, and it sends you to "rescue" accounts that were never leaving.

How do you validate a customer health score?

You backtest it against outcomes you already have. Churn gives you a labeled dataset for free: every account that cancelled last quarter is a test case, and so is every account that renewed. You do not need a data scientist. You need a spreadsheet and an honest hour.

The method:

1. List last quarter's churned accounts. Every cancellation and every non-renewal. For most companies with 50 to 500 accounts this is a list of 3 to 15 names.
2. Look up what their health score said 60 and 90 days before they cancelled. Not the day they cancelled. By cancellation day everyone knows. The score's job is to warn you while there is still time to act, which for B2B renewals means two to three months out.
3. Do the same for a sample of renewals. Pull 10 to 20 accounts that renewed in the same quarter and record their scores at the same 60 and 90 day marks before renewal.
4. Compare the two groups. Churned accounts should have been meaningfully redder than renewed accounts at those checkpoints. Count how many churned accounts were flagged red or yellow, and how many renewed accounts were flagged red.

That comparison gives you two numbers worth writing down, plus one qualitative check:

| Measure | Question it answers | Bad sign |
| --- | --- | --- |
| Catch rate | Of the accounts that churned, how many did the score flag 60+ days out? | Fewer than half were flagged |
| False alarm rate | Of the accounts flagged red, how many renewed without churning or downgrading? | Most red accounts renewed fine |
| Explainability | Can you say why each flagged account was flagged? | The reason is "the formula" |
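
If you would rather compute the two rates than count them by hand, here is a minimal Python sketch. The `Account` record, the `flag_60d` field, and the sample data are all made up for illustration. It assumes "caught" means flagged red or yellow at the 60-day checkpoint, and it reads the false alarm rate as the share of red accounts that went on to renew.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    flag_60d: str  # "red", "yellow", or "green" at the 60-day checkpoint
    churned: bool  # True if the account cancelled or did not renew


def catch_rate(accounts):
    """Share of churned accounts that were flagged red or yellow in time."""
    churned = [a for a in accounts if a.churned]
    if not churned:
        return None  # nothing churned, so there is nothing to catch
    caught = sum(a.flag_60d in ("red", "yellow") for a in churned)
    return caught / len(churned)


def false_alarm_rate(accounts):
    """Share of red-flagged accounts that went on to renew."""
    red = [a for a in accounts if a.flag_60d == "red"]
    if not red:
        return None  # no red flags, so no false alarms to measure
    return sum(not a.churned for a in red) / len(red)


# Made-up sample: 4 churned accounts plus a sample of 6 renewals.
SAMPLE = [
    Account("a1", "red", True), Account("a2", "yellow", True),
    Account("a3", "green", True), Account("a4", "green", True),
    Account("b1", "red", False), Account("b2", "green", False),
    Account("b3", "green", False), Account("b4", "yellow", False),
    Account("b5", "green", False), Account("b6", "green", False),
]

print(f"catch rate: {catch_rate(SAMPLE):.0%}")
print(f"false alarm rate: {false_alarm_rate(SAMPLE):.0%}")
```

Because the renewals are only a sample, treat the false alarm rate as a rough reading rather than a precise figure. With 3 to 15 churn events per quarter, both rates will move a lot from one account.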

If your historical scores were never snapshotted, that is itself the first finding: a score you cannot look up retroactively cannot be validated, so start snapshotting it weekly (a scheduled export to a spreadsheet is enough) and run the backtest next quarter.
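
As a rough illustration of what weekly snapshotting could look like, the sketch below appends scores to a CSV file and later looks up the most recent score on or before a checkpoint. The file layout, column names, and both function names are assumptions made for this example, not part of any particular tool.

```python
import csv
from datetime import date, timedelta
from pathlib import Path


def append_snapshot(path, scores, snapshot_date):
    """Append one row per account to a CSV; write the header on first use."""
    path = Path(path)
    is_new = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["snapshot_date", "account", "score"])
        for account, score in sorted(scores.items()):
            writer.writerow([snapshot_date.isoformat(), account, score])


def score_before(path, account, event_date, days_before):
    """Latest score recorded on or before the checkpoint, or None."""
    checkpoint = event_date - timedelta(days=days_before)
    best = None  # (snapshot_date, score) of the latest qualifying row
    with Path(path).open(newline="") as f:
        for row in csv.DictReader(f):
            taken = date.fromisoformat(row["snapshot_date"])
            if row["account"] == account and taken <= checkpoint:
                if best is None or taken > best[0]:
                    best = (taken, int(row["score"]))
    return best[1] if best else None
```

A weekly scheduled job would call `append_snapshot` with that week's scores. At backtest time, `score_before(path, "acme", cancelled_on, 60)` returns what the score said at the 60-day mark, or `None` if no snapshot existed that early.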

Your score must be red 60 to 90 days before cancellation to matter.

Why a score that flags everyone is as useless as one that flags no one

There are two ways for a score to fail the backtest, and they feel very different but cost the same.

The first failure is silence: churned accounts sat at green until the cancellation email arrived. This usually means the score is built on lagging or vanity inputs, things like NPS responses from two quarters ago, whether a QBR happened, or account size. Those describe the relationship's paperwork, not its behavior.

The second failure is noise: the score flags a third of your book as red every week. Nobody can work a list that long, so the team learns to ignore red, and the one genuinely dying account is invisible inside the crowd. A fire alarm that goes off every day protects nothing.

The test for noise is simple: divide your red accounts by the number of save conversations your team can actually run in a week. If the red list is bigger than your capacity to act on it, the threshold is wrong or the inputs are wrong, and the score is generating anxiety instead of decisions.

A working health score produces a short list. If you have 200 accounts, a useful red list is 5 to 15 accounts, each with a stated reason. If your score cannot produce that, fix the score before you fix the accounts.
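
The noise test is a single division, sketched here in Python. The function name and the example numbers are illustrative. It assumes you can put a number on how many save conversations the team can run in a week.

```python
def red_list_check(red_count, weekly_save_capacity):
    """Return (load, workable): red accounts per weekly save slot,
    and whether the red list fits inside the team's capacity."""
    if weekly_save_capacity <= 0:
        raise ValueError("weekly_save_capacity must be positive")
    load = red_count / weekly_save_capacity
    return load, load <= 1.0


# A third of a 200-account book flagged red, with room for 10 save calls a week.
load, workable = red_list_check(200 // 3, 10)
print(f"load {load:.1f}x capacity, workable: {workable}")
```

A load above 1.0 means the red list is longer than the team can work in a week, which is the signal to revisit the threshold or the inputs.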

The "reason attached" requirement

Every flagged account needs a reason a human can read and act on. "Health: 47" is not a reason. "Usage down 40 percent since March and the champion has not answered three emails" is a reason, and it also tells you what the save call is about.

This requirement does real work in two directions:

- It makes the score actionable. A CSM or founder looking at a red account with a reason knows what to say in the first sentence of the outreach. A red account without a reason gets the generic "just checking in" email, which is the same as no outreach.
- It keeps the score honest. If you cannot attach a reason, the score is combining signals in a way nobody understands, which means nobody will trust it, which means nobody will act on it, which puts you back at decoration.
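
One way to enforce the requirement is to build the flag from the reasons, so an account can only turn red when at least one readable reason exists. The sketch below assumes three hypothetical signals and made-up thresholds. Replace both with whatever your own backtest supports.

```python
from dataclasses import dataclass


@dataclass
class Signals:
    usage_change_pct: float  # usage change vs. a baseline period, e.g. -40.0
    unanswered_emails: int   # consecutive emails the champion has not answered
    failed_payments: int     # failed or late payments this quarter


# Hypothetical thresholds for illustration only.
USAGE_DROP_PCT = -30.0
UNANSWERED_LIMIT = 3
FAILED_PAYMENT_LIMIT = 1


def reasons(s):
    """Return the human-readable reasons that apply to this account."""
    out = []
    if s.usage_change_pct <= USAGE_DROP_PCT:
        out.append(f"usage down {abs(s.usage_change_pct):.0f} percent")
    if s.unanswered_emails >= UNANSWERED_LIMIT:
        out.append(f"champion has not answered {s.unanswered_emails} emails")
    if s.failed_payments >= FAILED_PAYMENT_LIMIT:
        out.append(f"{s.failed_payments} failed payment(s)")
    return out


def flag(s):
    """Flag red only when at least one reason can be stated."""
    found = reasons(s)
    return ("red", "; ".join(found)) if found else ("green", "")


print(flag(Signals(usage_change_pct=-40.0, unanswered_emails=3, failed_payments=0)))
```

The design choice is that the reason text is produced by the same rule that raises the flag, so the two cannot drift apart.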

When you run the quarterly backtest, check reasons too: for the churned accounts the score did flag, was the attached reason the actual reason they left? A score that flags the right accounts for the wrong reasons will eventually flag the wrong accounts.

When to simplify: fewer signals beat clever weights

The instinct after a failed backtest is to add sophistication: more inputs, decimal weights, maybe a request to the data team for a model. Resist it. At 50 to 500 accounts, you do not have enough churn events per quarter to tune a complicated model, and a weighting scheme nobody understands fails the reason-attached requirement automatically.

The better move is usually subtraction. Take the backtest results and ask which individual signals actually separated churned accounts from renewed ones. In most books it is a short list: usage trend, champion responsiveness, and payment or contract behavior tend to carry nearly all the signal. Rebuild the score on the three to five inputs that demonstrably worked, with simple thresholds, and drop the rest.
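
To see which signals separated the two groups, you can compare how often each one fired among churned accounts and among renewed ones. The sketch below is a simple tally, not a statistical test, and the signal names and data are invented for illustration.

```python
def signal_separation(accounts, signal_names):
    """For each signal, return (share of churned accounts where it fired,
    share of renewed accounts where it fired, gap between the two)."""
    churned = [a for a in accounts if a["churned"]]
    renewed = [a for a in accounts if not a["churned"]]
    if not churned or not renewed:
        raise ValueError("need at least one churned and one renewed account")
    result = {}
    for name in signal_names:
        c = sum(bool(a[name]) for a in churned) / len(churned)
        r = sum(bool(a[name]) for a in renewed) / len(renewed)
        result[name] = (c, r, c - r)
    return result


def row(churned, usage_down, champion_silent, no_qbr):
    return {"churned": churned, "usage_down": usage_down,
            "champion_silent": champion_silent, "no_qbr": no_qbr}


# Made-up backtest rows: 4 churned accounts, 6 renewed accounts.
ACCOUNTS = [
    row(True, True, True, True), row(True, True, False, False),
    row(True, True, True, True), row(True, False, True, False),
    row(False, False, False, True), row(False, False, False, False),
    row(False, True, False, True), row(False, False, False, True),
    row(False, False, False, False), row(False, False, True, False),
]

SEPARATION = signal_separation(ACCOUNTS, ["usage_down", "champion_silent", "no_qbr"])
for name, (c, r, gap) in sorted(SEPARATION.items(), key=lambda kv: -kv[1][2]):
    print(f"{name}: churned {c:.0%}, renewed {r:.0%}, gap {gap:+.0%}")
```

Signals with a gap near zero, like the invented `no_qbr` here, are candidates to drop. With only a handful of churn events per quarter, small gaps are not evidence either way.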

A three-signal score everyone trusts and acts on beats a fifteen-signal score everyone ignores. The measure of a health score is decisions changed per quarter, not inputs consumed.

The one-hour quarterly validation ritual

Put a recurring 60-minute block on the calendar for the first week of each quarter. Here is the agenda:

| Minutes | Step |
| --- | --- |
| 0-10 | List last quarter's churned and downgraded accounts |
| 10-25 | Record each one's health score 60 and 90 days pre-cancellation |
| 25-35 | Pull 10-20 renewed accounts and record their scores at the same marks |
| 35-45 | Compute catch rate and false alarm rate; compare to last quarter |
| 45-55 | For each miss, name the signal that would have caught it |
| 55-60 | Change one thing: add that signal, cut a dead one, or move a threshold |

Two rules make the ritual stick. First, change at most one or two things per quarter, so next quarter's backtest tells you whether the change helped. Second, write the two rates somewhere visible. A score whose catch rate is improving quarter over quarter is a tool being sharpened. A score with no recorded history is decoration, whatever the dashboard says.

Do this four times and you will have something rare: a health score with a track record, which is the only kind worth trusting your renewal forecast to.
