Building the engine while driving it
Standing up an experimentation program and PQL framework alongside a brand-new self-serve motion
Context
A growth-stage marketing automation platform — email and ads — serving SMB ecommerce merchants. Roughly 2,500 paying customers at an average of $400 per month, about $12M in ARR. The company was mid-transition from sales-led to product-led: a free trial had just launched as the new front door, and most of the sales organization was wound down, with a small remaining team shifting into a consultative model. I owned growth marketing for the self-serve funnel.
The tension
The trial was brand new, which meant everything downstream of it was undefined. No conversion benchmarks. Instrumentation half-built. Early signals messy and incomplete. And two open questions nobody had answered: how do we learn what makes this motion work, and now that reps aren't chasing every signup, who actually deserves human attention?
The culture was fail fast, learn fast, get back up — which was the right instinct, but instinct without a system produces motion, not learning.
The decision
The tempting move was to wait: let the new trial bake, collect a few clean quarters of baseline, then optimize from solid ground. I killed that option. In a transition, the cost of not learning beats the cost of imperfect data — waiting for clean baselines meant burning two quarters to learn nothing. Instead I proposed building the measurement and the motion in parallel: an experimentation program co-built with product and data partners, and a product-qualified lead framework that decided where consultant attention earned its cost.
Directional beats perfect. We acted on the data we had and improved the data while we acted.
The build
Experimentation. I owned the program; product, engineering, and data co-built the pipeline. We started with the prioritization model, because without one you test whatever's loudest in the room. Ours was deliberately simple — three dimensions: projected impact on a specific conversion metric, effort to get the test off the ground, and speed to learn, because when a motion is new, small fast tests beat big slow ones. Reach lived inside impact — a winning test on two percent of signups isn't a winning test. Every experiment ran the same cycle: hypothesis tied to a funnel metric, success thresholds set before launch so nobody moved goalposts, results documented in a shared learning library win or lose.
The first quarter was unglamorous — fixing event tracking, writing the test-design template, killing pet ideas that scored poorly. By the second quarter we were running nine experiments per quarter against a previous reality of a handful per year, and the wins were compounding because the losses were documented.
PQL framework. At $4,800 ACV, human touch pencils — but only where it changes the outcome. I pulled the early trial behavior we had and worked with our data analyst to find which actions separated converters from churners. Three first-week signals dominated. We built a simple scoring model on them and routed only threshold-crossing accounts to the consultant team, with the triggering behavior attached so every conversation opened with context instead of a cold pitch. Automation carried everyone else.
Toggle the first-week signals and watch the decision. Illustrative weights and threshold, altered per disclaimer — the mechanism is the point.
Score 0 · routing threshold 70
Automation keeps nurturing — human touch isn't earned yet.
The results
Over the program's first two quarters, trial-to-paid conversion moved from a 9.8% initial baseline to 12.5% — a 28% relative lift, driven mostly by three winning experiments on first-week activation steps. Lead-to-opportunity rate improved 37% under the PQL definition, and consultant acceptance of routed accounts hit 64%, against 31% when routing meant "trial signup with a pulse." That second number mattered as much as the conversion math — it made the consultative model credible to the people living it.
What didn't move: average revenue per account. The gains came from converting more of the users we already attracted, not from shifting plan mix. That told us where the next program needed to aim — expansion, not acquisition.
What I'd do differently
Start the learning library on day one, not month three. The experiments we ran before documentation existed had to be partially re-run, because institutional memory turned out to be exactly as reliable as you'd expect.