Best Practices for Incrementality Testing in Cassandra

Incrementality testing is essential for measuring the true impact of marketing activities beyond natural trends and external influences. This guide outlines best practices for running reliable experiments, interpreting results, and applying findings.

1. Understanding Incrementality Testing

What is Incrementality Testing?

Incrementality testing isolates the impact of marketing spend by comparing a test group (exposed to marketing) with a control group (not exposed) under similar conditions. The difference in outcomes between the two groups represents the incremental effect of marketing efforts.
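
As a simple illustration with made-up numbers, the control group serves as the counterfactual, and the incremental effect is the gap between the two groups:

```python
# Toy illustration of incrementality (all numbers are hypothetical).
test_revenue = 120_000      # revenue in regions exposed to the marketing change
control_revenue = 100_000   # revenue in comparable regions left unchanged

# The control group approximates the counterfactual: what the test regions
# would have earned without the extra marketing (assuming the groups are
# well matched on baseline size and trend).
incremental_revenue = test_revenue - control_revenue   # 20,000
measured_lift = incremental_revenue / control_revenue  # 0.20, i.e. a 20% lift

print(f"Incremental revenue: {incremental_revenue:,}")
print(f"Measured lift: {measured_lift:.0%}")
```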

Why is Incrementality Testing Important?

  • Validates model assumptions about channel performance.
  • Measures the true impact of marketing beyond platform-reported metrics.
  • Calibrates MMM models with real-world experimental data.
  • Optimizes budget allocation based on empirical evidence.
  • Challenges or confirms platform-reported attribution.

2. Best Practices for Running Experiments

Types of Incrementality Experiments

  1. Geo-Experiments (Geographic Holdouts)
    • Divide markets into test and control regions.
    • Adjust marketing spend in test regions while keeping control regions unchanged.
    • Measure impact by comparing performance across regions.
  2. Conversion Lift Studies (Platform-Based)
    • Conducted within platforms like Meta, Google, and TikTok.
    • Test and control groups are randomized within the ad platform’s experimentation framework.
    • Measures the incremental impact of ads based on conversion behavior.
  3. Difference-in-Differences (DID) Experiments
    • Used when external factors might influence both test and control groups.
    • Compares the before/after change in the test group with the same change in the control group (see the sketch after this list).
  4. Scale-Up Tests
    • Increase spending in test regions and analyze whether incremental revenue aligns with expected ROI.
    • Helps define saturation points for scaling budgets.
  5. New Channel Introduction
    • Launch a new channel or tactic in selected test regions.
    • Compare performance to control regions where the channel is absent.
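
For the difference-in-differences case, here is a minimal sketch of the estimate on a daily geo-level dataset; the file name, column names, and launch date are assumptions for illustration:

```python
import pandas as pd

# Minimal difference-in-differences (DiD) sketch on daily geo-level data.
# File name, column names ("date", "group", "revenue"), and the launch date
# are placeholders for illustration.
df = pd.read_csv("geo_daily_revenue.csv", parse_dates=["date"])

test_start = pd.Timestamp("2024-05-01")
df["period"] = df["date"].ge(test_start).map({True: "post", False: "pre"})

# Mean daily revenue per geo, split by group (test/control) and period.
means = df.groupby(["group", "period"])["revenue"].mean().unstack("period")

# DiD: the test group's pre-to-post change minus the control group's change.
did = (means.loc["test", "post"] - means.loc["test", "pre"]) - (
    means.loc["control", "post"] - means.loc["control", "pre"]
)
print(f"DiD estimate of incremental daily revenue per geo: {did:,.0f}")
```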

3. Data & Experiment Design

Data Requirements

To ensure accuracy, gather:

  • Date: Daily or weekly timestamps.
  • Outcome KPI: Total revenue, orders, or conversions.
  • Geo-Granularity: City, state, or DMA-level data.
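
As a sketch, a tidy input table along these lines is usually enough; the column names below are illustrative, not a required schema:

```python
import pandas as pd

# Illustrative layout of the experiment dataset; column names are not a required schema.
data = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-04-01", "2024-04-01", "2024-04-02", "2024-04-02"]),
        "geo": ["California", "Texas", "California", "Texas"],  # state/DMA-level unit
        "revenue": [41_250.0, 28_900.0, 39_780.0, 30_120.0],    # outcome KPI
        "orders": [512, 377, 498, 391],                         # optional secondary KPI
    }
)
print(data)
```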

Historical Data Requirements

  • Minimum 4-5× the test duration (e.g., for a 2-week test, use 8-10 weeks of historical data).
  • Optimal scenario: 1 year of historical data when available.

Geographic Granularity Considerations

  • Avoid zip-code-level granularity in large markets (too fragmented).
  • Optimal granularity: State or region level for stable test-control comparisons.
  • Aim for roughly 100-250 locations so that test-control comparisons have enough statistical power.

Designing a Reliable Experiment

  • Duration: At least 14-21 days to capture full impact.
  • Budget: Make the spend change large enough that the expected lift can reach statistical significance (see the power sketch after this list).
  • Geography Selection: Use Cassandra’s algorithm to select comparable test/control groups.
  • Lag Effects: Consider delayed conversions (e.g., upper-funnel campaigns may take weeks to show results).
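
As a rough back-of-the-envelope check of whether a planned design is large enough, a standard two-sample power calculation can help; the effect-size conversion and all numbers below are assumptions, not Cassandra’s methodology:

```python
from statsmodels.stats.power import TTestIndPower

# Back-of-the-envelope power check: roughly how many geos per arm are needed
# to detect the expected lift?  All inputs are illustrative assumptions.
expected_lift = 0.05   # expect a 5% lift in the outcome KPI
geo_kpi_cv = 0.20      # coefficient of variation of the KPI across geos
effect_size = expected_lift / geo_kpi_cv  # crude Cohen's d approximation

geos_per_arm = TTestIndPower().solve_power(
    effect_size=effect_size, alpha=0.05, power=0.80, alternative="two-sided"
)
print(f"Approximate geos needed per arm: {geos_per_arm:.0f}")
```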

4. Running the Experiment

Implementation Steps

  1. Select test regions using Cassandra’s recommendations.
  2. Adjust campaign targeting or budgets accordingly.
  3. Monitor campaign execution to ensure accurate spend and reach.
  4. Maintain consistent tracking for total conversions (not just ad-attributed ones).
  5. Allow additional time for lag effects before final analysis.
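
During the test, a simple daily check can catch execution drift early; the file name, column names, and planned budget below are placeholders:

```python
import pandas as pd

# Flag days where actual spend in the test regions drifts from plan.
# File name, column names, and the planned figure are placeholders.
spend = pd.read_csv("daily_spend_by_geo.csv", parse_dates=["date"])

planned_daily_test_spend = 5_000.0  # planned total daily spend across test geos
actual = spend.loc[spend["group"] == "test"].groupby("date")["spend"].sum()

deviation = (actual - planned_daily_test_spend) / planned_daily_test_spend
off_plan_days = deviation[deviation.abs() > 0.10]  # more than 10% off plan
print(off_plan_days)
```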

Best Practices

  • Avoid major changes to other marketing channels during the test period.
  • Consider seasonal factors and promotions that may skew results.
  • Document external factors that could impact results (economic events, competitor actions).
  • Maintain test conditions for the entire duration without premature optimizations.

5. Interpreting & Applying Results

Key Metrics for Analysis

  • Measured Lift: Percentage difference between test and control groups.
  • Statistical Significance: Confidence level (aim for 95%).
  • Incremental ROAS (iROAS): Revenue lift divided by additional spend.
  • Total Incremental Revenue: Attributed revenue beyond baseline.
  • Confidence Intervals: Range of estimated impact, indicating certainty.
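
To make the relationships between these metrics concrete, here is an illustrative calculation over per-geo totals for the test window; the figures, the matched pairing, and the spend amount are assumptions:

```python
import numpy as np

# Illustrative calculation of lift, iROAS, and a rough confidence interval
# from per-geo totals; all figures and the matched pairing are assumptions.
test_rev = np.array([52_000, 48_500, 61_200, 45_900])     # test geos
control_rev = np.array([44_100, 47_300, 50_800, 43_000])  # matched control geos
extra_spend = 30_000.0                                     # additional spend in test geos

incremental_revenue = test_rev.sum() - control_rev.sum()
measured_lift = incremental_revenue / control_rev.sum()
iroas = incremental_revenue / extra_spend

# Rough 95% CI on the mean per-geo difference (normal approximation).
diff = test_rev - control_rev
se = diff.std(ddof=1) / np.sqrt(len(diff))
ci = (diff.mean() - 1.96 * se, diff.mean() + 1.96 * se)

print(f"Lift: {measured_lift:.1%}, iROAS: {iroas:.2f}")
print(f"95% CI on mean per-geo incremental revenue: {ci[0]:,.0f} to {ci[1]:,.0f}")
```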

Understanding Your Results

  • If measured lift is significant, the tested channel or tactic has a proven impact.
  • If confidence intervals are wide, results may be inconclusive (consider longer tests or more data).
  • If measured lift is lower than expected, reconsider assumptions about channel ROI.

Using Results for Model Calibration

  1. Input findings into Cassandra’s Calibration Panel.
  2. Adjust channel ROI assumptions based on experimental results.
  3. Re-run MMM models with updated priors to refine future predictions (a conceptual sketch follows this list).
  4. Use results to refine budget allocation and optimization strategies.
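
As a conceptual sketch only (this is not Cassandra’s internal implementation), an experiment’s iROAS estimate and confidence interval can be translated into a prior on the channel’s ROI before re-running the model; the normal-prior choice and the numbers are assumptions:

```python
# Conceptual sketch: turn an experiment's iROAS result into an MMM prior.
# Not Cassandra's internal implementation; numbers and the normal-prior
# choice are illustrative assumptions.
iroas_estimate = 2.4          # point estimate from the experiment
ci_low, ci_high = 1.6, 3.2    # 95% confidence interval from the experiment

prior_mean = iroas_estimate
prior_sd = (ci_high - ci_low) / (2 * 1.96)  # back out a standard deviation

# The MMM can then be re-run with, e.g., a Normal(prior_mean, prior_sd)
# prior on the channel's ROI coefficient.
print(f"Channel ROI prior ~ Normal(mean={prior_mean:.2f}, sd={prior_sd:.2f})")
```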

6. Common Pitfalls & How to Avoid Them

  • Pitfall: Test/control groups are imbalanced. Solution: Use algorithmic matching based on historical performance.
  • Pitfall: External factors influence results. Solution: Document and adjust for promotions, economic changes, etc.
  • Pitfall: Sample size is too small. Solution: Expand the number of test locations or extend the test duration.
  • Pitfall: Measuring too soon. Solution: Consider delayed effects, especially for brand campaigns.
  • Pitfall: Platform-reported data is misleading. Solution: Compare results with total business outcomes (not just platform attribution).

7. Summary & Next Steps

  • Use geo-experiments or conversion lift studies to measure true incremental impact.
  • Ensure strong test-control matching to eliminate bias.
  • Interpret results carefully, considering confidence intervals and statistical significance.
  • Use results to calibrate Cassandra models and refine future budget allocation.
  • Avoid common pitfalls like small sample sizes, external disruptions, and incorrect attribution.