
Geo-Experimentation in Cassandra: Complete Guide

Overview

What are Geo-Experiments?

Geo-experiments are controlled tests designed to measure the true incremental impact of marketing activities by comparing test and control groups across different geographic areas.

The methodology works as follows:

  1. Geographic areas are divided into two groups:
    • Test Group: Areas where the marketing activity will be modified
    • Control Group: Areas where marketing continues as usual
  2. The algorithm selects these groups to be as similar as possible in terms of:
    • Historical revenue patterns
    • Seasonality
    • Market size
    • Consumer behavior
  3. This similarity ensures that any external factors (seasonality, market conditions, other marketing activities) affect both groups equally, allowing us to isolate the specific impact of our test variable.

For example: if the test group shows a 10% increase and the control group a 5% increase during a high season, we can attribute the 5-percentage-point difference to our marketing activity, since both groups were equally affected by seasonality.
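To make the arithmetic concrete, here is a minimal sketch of that comparison, using the hypothetical growth figures from the example above:

```python
# Hypothetical figures from the example above, not real data.
test_growth = 0.10     # 10% increase observed in the test group
control_growth = 0.05  # 5% increase observed in the control group

# Both groups experience the same seasonality, so the control group's
# growth acts as the counterfactual baseline for the test group.
incremental_lift = test_growth - control_growth

print(f"Incremental lift attributable to the test: {incremental_lift:.1%}")
# -> Incremental lift attributable to the test: 5.0%
```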

Why are they Important?

  • Validate model assumptions about channel performance
  • Measure true incremental impact of marketing activities
  • Calibrate the MMM model with real experimental data
  • Inform budget allocation decisions with empirical evidence
  • Challenge or confirm platform-reported metrics

Data Requirements

Required Data Points

The experiment requires only two essential data points per geography:

  • Date
  • Total output variable (e.g., Total Revenue, Total Orders)

💡 Important: These should be total figures, NOT channel-attributed numbers.

Example: Use total daily revenue for each geography, not “revenue attributed to Meta ads”
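As an illustration, the input could look like the following sketch (Python/pandas; the column names `date`, `location`, and `total_revenue` are hypothetical, not a prescribed schema):

```python
import pandas as pd

# One row per geography per day, carrying TOTAL revenue for that geography,
# never channel-attributed revenue. Values and column names are illustrative.
data = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-02"]
        ),
        "location": ["CA", "TX", "CA", "TX"],
        "total_revenue": [12500.0, 9800.0, 13100.0, 10200.0],
    }
)
print(data)
```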

Historical Data Requirements

  • Minimum historical data should be 4-5 times the intended test duration
    • For a 1-month test: 4-5 months of historical data
    • For a 2-week test: 8-10 weeks of historical data
  • Optimal scenario: 1 year of historical data when possible

Geographic Granularity

The platform has specific limitations on data volume to ensure reliable processing:

  • Maximum of ~40,000 total rows in the dataset
  • This limit applies to the combination of locations and time periods
  • Examples of viable combinations:
    • 100 locations × 365 days
    • 260 locations × 150 days
  • Best practices:
    • Keep total locations under 260 so that 4-5 months of historical data can still be used
    • Optimal scenario: ≤100 locations to allow for a full year of historical data
    • Avoid zip-code-level granularity in large markets, as it often exceeds processing limits
    • Use state/region level where possible
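A quick way to sanity-check a proposed granularity is to multiply locations by days before preparing the dataset. A minimal sketch (the ~40,000 figure is the row limit described above):

```python
def check_dataset_size(n_locations: int, n_days: int, max_rows: int = 40_000) -> bool:
    """Report whether a locations x days combination stays within the row limit."""
    total_rows = n_locations * n_days
    status = "OK" if total_rows <= max_rows else "exceeds limit"
    print(f"{n_locations} locations x {n_days} days = {total_rows:,} rows ({status})")
    return total_rows <= max_rows

check_dataset_size(100, 365)  # 36,500 rows -> OK
check_dataset_size(260, 150)  # 39,000 rows -> OK
check_dataset_size(500, 365)  # 182,500 rows -> exceeds limit
```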

Experiment Design

Types of Geo-Experiments

  1. Hold-out Tests
    • Completely stop spending in test regions
    • Measures baseline contribution of channel
    • Best for validating channel incrementality
  2. Scale-up Tests
    • Increase spending in test regions
    • Measures potential for growth
    • Best for testing saturation points
  3. New Channel Tests
    • Test new channel in specific regions
    • Measures incremental impact of new activity
    • Best for validating expansion plans

Design Parameters

  • Duration: Typically 14-21 days minimum
  • Budget: Determined by the expected lift and ROI
  • Geography Selection: Algorithm selects regions to create comparable test/control groups
  • Expected Lift: Minimum detectable effect needed for statistical significance

Channel-Specific Considerations

When testing upper-funnel activities (e.g., Meta Awareness, YouTube), consider the delayed effect:

Example Scenario:

  • Test Duration: 3 weeks
  • Channel’s Known Lag Effect: 2 weeks
  • Analysis Approach:
    1. First Analysis: at the end of the 3-week test period
    2. Final Analysis: at 5 weeks (3 weeks test + 2 weeks lag)

This ensures we capture the full impact, including delayed conversions (see the scheduling sketch below).

This is particularly important for:

  • Brand awareness campaigns
  • Video advertising
  • Content marketing
  • Other upper-funnel activities with known lag effects
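The analysis schedule from the example above is simple enough to capture in a few lines. A sketch of the two analysis dates, assuming a hypothetical start date:

```python
from datetime import date, timedelta

def analysis_dates(test_start: date, test_weeks: int, lag_weeks: int) -> tuple[date, date]:
    """First analysis at the end of the test; final analysis after the lag window."""
    first_analysis = test_start + timedelta(weeks=test_weeks)
    final_analysis = first_analysis + timedelta(weeks=lag_weeks)
    return first_analysis, final_analysis

# Hypothetical start date, with the 3-week test and 2-week lag from the example.
first, final = analysis_dates(date(2024, 6, 3), test_weeks=3, lag_weeks=2)
print(f"First analysis: {first}, final analysis: {final}")
# -> First analysis: 2024-06-24, final analysis: 2024-07-08
```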

Understanding Expected Lift

The expected lift shown in the experiment design represents:

  • The minimum change needed to validate the input ROI assumption
  • NOT a prediction of actual results
  • A threshold for statistical significance

It is calculated based on:

  • Input ROI/ROAS
  • Historical performance
  • Geographic variance
  • Test duration

Example:

If input ROAS = 10
Expected Lift = -5%

This means: to validate a ROAS of 10, we need to see at least a 5% reduction in revenue when spend is reduced. If the observed reduction is smaller, the actual ROAS is lower than 10.
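The -5% in this example follows from simple arithmetic, sketched below with hypothetical spend and revenue figures. (Cassandra’s actual calculation also factors in historical performance, geographic variance, and test duration, as listed above.)

```python
# Hypothetical hold-out test figures; illustrative arithmetic only.
input_roas = 10.0                 # assumed ROAS being validated
spend_reduction = 50_000.0        # spend removed from the test regions
baseline_revenue = 10_000_000.0   # expected test-region revenue without the change

# If ROAS is truly 10, removing $50k of spend should remove $500k of revenue.
expected_revenue_change = -input_roas * spend_reduction
expected_lift = expected_revenue_change / baseline_revenue

print(f"Expected lift: {expected_lift:.1%}")  # -> Expected lift: -5.0%
```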

Running the Experiment

Implementation Steps

  1. Select test regions based on Cassandra’s recommendations
  2. Adjust campaign targeting/budgets accordingly
  3. Monitor spend and delivery throughout test period
  4. Maintain consistent tracking and measurement
  5. Allow for additional time to measure delayed effects

Best Practices

  • Avoid major changes to other channels during test period
  • Consider seasonal effects and promotional calendar
  • Document any external factors that might impact results
  • Maintain test conditions for full duration
  • Monitor for any technical issues or tracking problems

Results Interpretation

Key Metrics

  • Measured lift/impact
  • Statistical significance
  • Incremental ROAS
  • Total incremental revenue
  • Confidence intervals

Understanding Results

  • Results show actual incremental impact vs. control group
  • Significance level indicates reliability of results (aim for 95%)
  • Confidence intervals show possible range of true effect
  • Compare actual lift to expected lift to validate ROI assumptions
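As an illustration of these checks, here is a minimal sketch with hypothetical hold-out results (the readout fields and figures are assumptions, not Cassandra output):

```python
# Hypothetical hold-out readout.
measured_lift = -0.062             # observed revenue change vs. control
ci_low, ci_high = -0.081, -0.043   # 95% confidence interval for the lift
expected_lift = -0.05              # threshold from the experiment design

# Reliable at the 95% level if the confidence interval excludes zero.
significant = ci_high < 0 or ci_low > 0

# For a hold-out, the ROI assumption is validated when the measured
# reduction is at least as large as the expected lift (more negative).
validates_roi = measured_lift <= expected_lift

print(f"Significant at 95%: {significant}")      # -> True
print(f"Validates input ROAS: {validates_roi}")  # -> True
```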

Using Results

  1. Model Calibration
    • Input results into model calibration
    • Update channel ROI assumptions
    • Refine budget allocation recommendations
  2. Strategic Planning
    • Inform budget allocation decisions
    • Guide channel strategy
    • Validate or challenge existing assumptions