Case Study

A/B Testing for Free Trial Optimization & Retention

Designed and analyzed an A/B test comparing a control (current free trial flow) against a treatment (time-commitment screener) to reduce cancellations and improve retention. Evaluated click-through, gross conversion, and net conversion using t-tests and confidence intervals, performed sanity checks on invariant metrics, and recommended launching based on statistically and practically significant retention improvements.

Organization: New York University
Role: Lead Analyst
Timeline: Aug – Nov 2024
Reading Time: 3 min read
Tags: A/B Testing · T-Tests · Confidence Intervals · Experimentation · Retention Analysis

Overview

Designed and analyzed an A/B test to evaluate whether adding a time-commitment screener to a free trial signup flow could reduce cancellations and improve user retention. The experiment compared the current trial experience (control) against a modified flow that set clearer expectations about the time investment required.

Problem

Free trial funnels often suffer from high cancellation rates — users sign up, realize the commitment is more than expected, and churn quickly. The hypothesis was that a screener surfacing the time commitment upfront would filter out low-intent signups, leading to better retention among users who proceed.

The challenge: this screener could also reduce gross conversion (fewer people start the trial), so we needed to measure whether the net retention improvement justified the top-of-funnel cost.
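
To make the tradeoff concrete, it helps to reason in retained users per pageview, the product of gross conversion and the retention rate among trial starters. A toy Python sketch with purely illustrative numbers (these are not the experiment's actual rates):

```python
def retained_per_pageview(gross_conversion: float, retention_rate: float) -> float:
    """Retained users per pageview: gross conversion times the retention
    rate among users who start the trial. This is the quantity the
    launch decision ultimately hinges on."""
    return gross_conversion * retention_rate

# Purely illustrative numbers, not the experiment's actual rates:
control = retained_per_pageview(0.10, 0.30)    # 0.030 retained users per pageview
treatment = retained_per_pageview(0.08, 0.45)  # 0.036: net-positive despite fewer trial starts
```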

Approach

Experimental design:

  • Control: current free trial flow (no screener)
  • Treatment: time-commitment screener added before trial activation
  • Randomization at the user level, making the user the unit of diversion so each person sees one consistent variant across visits (sketched below)
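
A minimal sketch of what user-level diversion can look like; the hash-based bucketing, function name, and 50/50 split are illustrative assumptions rather than the production assignment code:

```python
import hashlib

def assign_group(user_id: str, experiment: str = "trial_screener") -> str:
    """Deterministic, user-level assignment: hashing (experiment, user_id)
    keeps the unit of diversion at the user, so the same user lands in the
    same group on every visit."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "treatment" if bucket < 50 else "control"  # 50/50 split
```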

Key metrics:

  • Click-through rate (engagement with the screener)
  • Gross conversion (trial starts / pageviews)
  • Net conversion (retained users / pageviews) — the primary decision metric (see the helper functions below)
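
Under these definitions the metrics reduce to simple ratios over funnel counts. A sketch with hypothetical field names standing in for whatever the event logging actually records:

```python
from dataclasses import dataclass

@dataclass
class FunnelCounts:
    pageviews: int    # visits to the trial page
    clicks: int       # clicks into the trial signup flow
    enrollments: int  # trial starts
    retained: int     # users still active (paying) after the trial window

def click_through_rate(c: FunnelCounts) -> float:
    return c.clicks / c.pageviews

def gross_conversion(c: FunnelCounts) -> float:
    return c.enrollments / c.pageviews  # trial starts per pageview

def net_conversion(c: FunnelCounts) -> float:
    return c.retained / c.pageviews     # the primary decision metric
```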

Sanity checks on invariant metrics — confirmed randomization integrity by verifying that metrics expected to be unaffected (e.g., total pageviews per group) showed no statistically significant difference between control and treatment.
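
A sketch of that pageview check, assuming an intended 50/50 split; the function name and 95% confidence level are illustrative:

```python
import math

def pageviews_look_balanced(pv_control: int, pv_treatment: int,
                            expected_share: float = 0.5,
                            z: float = 1.96) -> bool:
    """Sanity check on an invariant metric: under clean randomization the
    control share of total pageviews should fall inside a 95% binomial
    confidence interval around the expected 50/50 split."""
    n = pv_control + pv_treatment
    se = math.sqrt(expected_share * (1 - expected_share) / n)
    lower, upper = expected_share - z * se, expected_share + z * se
    observed = pv_control / n
    return lower <= observed <= upper
```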

Statistical analysis:

  • T-tests and confidence intervals for each metric
  • Distinguished statistical significance (is the effect real?) from practical significance (is it large enough to matter operationally?)
  • Assessed effect sizes relative to business thresholds, as sketched below
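
A sketch of how each metric could be evaluated, assuming daily per-group metric values and a hypothetical practical-significance threshold `d_min`; this is one plausible reading of the analysis described above, not the original code:

```python
import numpy as np
from scipy import stats

def evaluate_metric(control_daily, treatment_daily, d_min, alpha=0.05):
    """Welch two-sample t-test on daily metric values, plus a confidence
    interval on the difference in means.

    Statistical significance: p < alpha (is the effect real?).
    Practical significance: the whole CI clears d_min, the smallest
    effect that matters operationally."""
    ctrl = np.asarray(control_daily, dtype=float)
    trt = np.asarray(treatment_daily, dtype=float)

    _, p_value = stats.ttest_ind(trt, ctrl, equal_var=False)

    diff = trt.mean() - ctrl.mean()
    v_t, v_c = trt.var(ddof=1) / len(trt), ctrl.var(ddof=1) / len(ctrl)
    se = np.sqrt(v_t + v_c)
    # Welch-Satterthwaite degrees of freedom for the interval
    df = (v_t + v_c) ** 2 / (v_t**2 / (len(trt) - 1) + v_c**2 / (len(ctrl) - 1))
    half_width = stats.t.ppf(1 - alpha / 2, df) * se
    ci = (diff - half_width, diff + half_width)

    return {
        "diff": diff,
        "p_value": p_value,
        "ci": ci,
        "statistically_significant": p_value < alpha,
        "practically_significant": ci[0] > d_min,
    }
```

Requiring the entire interval to clear `d_min`, rather than just the point estimate, is the conservative launch criterion: even the pessimistic end of the estimate is still worth acting on.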

Results

  • Gross conversion decreased (as expected — the screener filtered out some users)
  • Net conversion improved — users who proceeded through the screener retained at a meaningfully higher rate
  • The retention improvement was both statistically significant and practically significant relative to business thresholds
  • Sanity checks on invariant metrics confirmed clean randomization

Recommendation

Recommended launching the treatment based on the net retention improvement. Proposed follow-up experiments to address residual early cancellations — specifically testing variations in screener messaging and trial length to further optimize the conversion-retention tradeoff.

Lessons Learned

  • Sanity checks on invariant metrics are non-negotiable — without them, you can't distinguish a real treatment effect from a randomization failure
  • Statistical significance alone isn't sufficient for launch decisions — practical significance relative to business cost structure matters more
  • A/B tests that reduce top-of-funnel volume can still be net-positive if they improve downstream retention enough to offset the loss