Geo-Experimentation in Cassandra: Complete Guide

Overview

What are Geo-Experiments?

Geo-experiments are controlled tests designed to measure the true incremental impact of marketing activities by comparing test and control groups across different geographic areas.

The methodology works by:

Dividing geographic areas into two groups:
- Test Group: Areas where the marketing activity will be modified
- Control Group: Areas where marketing continues as usual
The algorithm selects these groups to be as similar as possible in terms of:
- Historical revenue patterns
- Seasonality
- Market size
- Consumer behavior
This similarity ensures that any external factors (seasonality, market conditions, other marketing activities) affect both groups equally, allowing us to isolate the specific impact of our test variable.

For example: If we see a 10% increase in the test group and a 5% increase in the control group during a high season, we can attribute the 5-percentage-point difference to our marketing activity, as both groups were equally affected by seasonality.
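The arithmetic in this example can be sketched in a few lines of Python. This is a simplified illustration, not Cassandra's actual estimator; the function name and the revenue figures are made up for the example.

```python
def incremental_lift_pp(test_pre, test_post, control_pre, control_post):
    """Return test growth minus control growth, in percentage points."""
    test_growth = (test_post - test_pre) / test_pre
    control_growth = (control_post - control_pre) / control_pre
    return (test_growth - control_growth) * 100


# Illustrative numbers: test group grows 10%, control group grows 5%.
lift = incremental_lift_pp(
    test_pre=100_000, test_post=110_000,
    control_pre=200_000, control_post=210_000,
)
print(round(lift, 2))  # 5.0 percentage points attributed to the activity
```

Because the control group absorbs the seasonal uplift, only the gap between the two growth rates is credited to the marketing activity.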

Why are they Important?

Validate model assumptions about channel performance
Measure true incremental impact of marketing activities
Calibrate the MMM model with real experimental data
Inform budget allocation decisions with empirical evidence
Challenge or confirm platform-reported metrics

Data Requirements

Required Data Points

The experiment requires only two essential data points:

Date
Total output variable (e.g., Total Revenue, Total Orders)

<aside> 💡

Important: These should be total figures, NOT channel-attributed numbers

</aside>

Example: Use total daily revenue for each geography, not “revenue attributed to Meta ads”
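As a rough sketch of what such a dataset could look like, the snippet below parses a small CSV with one row per date and geography. The column names (`date`, `location`, `total_revenue`), the region labels, and the values are assumptions chosen for illustration; the exact upload format expected by Cassandra is not specified here.

```python
import csv
import io
from datetime import date

# Made-up sample: total daily revenue per geography, not channel-attributed.
SAMPLE_CSV = """date,location,total_revenue
2024-01-01,Region A,18250.00
2024-01-01,Region B,12990.50
2024-01-02,Region A,17410.25
2024-01-02,Region B,13105.00
"""


def load_rows(text):
    """Parse CSV text into typed rows: date, location, total revenue."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        rows.append({
            "date": date.fromisoformat(record["date"]),
            "location": record["location"],
            "total_revenue": float(record["total_revenue"]),
        })
    return rows


rows = load_rows(SAMPLE_CSV)
print(len(rows), rows[0])
```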

Historical Data Requirements

Historical data should cover at least 4-5 times the intended test duration
- For a 1-month test: 4-5 months of historical data
- For a 2-week test: 8-10 weeks of historical data
Optimal scenario: 1 year of historical data when possible
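The 4-5x rule of thumb is simple enough to compute directly. The helper below is a minimal sketch; the function name and its default multiples are illustrative, taken from the guideline above.

```python
def required_history_days(test_days, low_multiple=4, high_multiple=5):
    """Return the (minimum, preferred) days of history for a test length."""
    return test_days * low_multiple, test_days * high_multiple


# A 2-week test needs 56-70 days, i.e. 8-10 weeks of history.
print(required_history_days(14))  # (56, 70)
```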

Geographic Granularity

The platform has specific limitations on data volume to ensure reliable processing:

Maximum of ~40,000 total rows in the dataset
This limit applies to the combination of locations and time periods
Examples of viable combinations:
- 100 locations × 365 days
- 260 locations × 150 days
Best practices:
- Keep total locations at or below 260 so that 4-5 months of historical data still fits within the row limit
- Optimal scenario: ≤100 locations to allow for full year of historical data
- Avoid zip code level granularity in large markets as it often exceeds processing limits
- Use state/region level where possible
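A quick pre-check of the row budget might look like the sketch below. It assumes one row per location per day and treats the approximate 40,000-row limit as a hard cutoff; both the constant and the function names are illustrative.

```python
MAX_ROWS = 40_000  # approximate platform limit stated above


def fits_row_limit(locations, days, max_rows=MAX_ROWS):
    """True if locations x days stays within the row limit."""
    return locations * days <= max_rows


def max_history_days(locations, max_rows=MAX_ROWS):
    """Largest number of daily observations per location that still fits."""
    return max_rows // locations


print(fits_row_limit(100, 365))    # True: 36,500 rows
print(fits_row_limit(260, 150))    # True: 39,000 rows
print(fits_row_limit(1_000, 120))  # False: 120,000 rows
print(max_history_days(260))       # 153
```

Running this before preparing an upload shows whether the geography needs to be aggregated (for example, from zip code to state/region) or the history window shortened.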

Experiment Design

Types of Geo-Experiments

Hold-out Tests
- Completely stop spending in test regions
- Measures baseline contribution of channel
- Best for validating channel incrementality
Scale-up Tests
- Increase spending in test regions
- Measures potential for growth
- Best for testing saturation points
New Channel Tests
- Test new channel in specific regions
- Measures incremental impact of new activity
- Best for validating expansion plans

Design Parameters

Duration: Typically at least 14-21 days
Budget: Determined by the expected lift and ROI
Geography Selection: Algorithm selects regions to create comparable test/control groups
Expected Lift: Minimum detectable effect needed for statistical significance

Channel-Specific Considerations

When testing upper-funnel activities (e.g., Meta Awareness, YouTube), consider the delayed effect:

Example Scenario:

Test Duration: 3 weeks
Channel’s Known Lag Effect: 2 weeks
Analysis Approach:
1. First Analysis: At the end of 3-week test period
2. Final Analysis: At 5 weeks (3 weeks test + 2 weeks lag)
This ensures we capture the full impact, including delayed conversions

This is particularly important for:

Brand awareness campaigns
Video advertising
Content marketing
Other upper-funnel activities with known lag effects
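The two-stage analysis schedule from the example scenario can be derived from the test start date, the test length, and the channel's lag. The start date and function name below are hypothetical.

```python
from datetime import date, timedelta


def analysis_dates(start, test_weeks, lag_weeks):
    """Return (first analysis date, final analysis date)."""
    first = start + timedelta(weeks=test_weeks)  # end of the test period
    final = first + timedelta(weeks=lag_weeks)   # after the lag has played out
    return first, final


# 3-week test with a 2-week lag, starting on a hypothetical date.
first, final = analysis_dates(date(2024, 3, 4), test_weeks=3, lag_weeks=2)
print(first, final)  # 2024-03-25 2024-04-08
```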

Understanding Expected Lift

The expected lift shown in the experiment design represents:

The minimum change needed to validate the input ROI assumption
NOT a prediction of actual results
A threshold for statistical significance
Calculated based on:
- Input ROI/ROAS
- Historical performance
- Geographic variance
- Test duration

Example:

If input ROAS = 10
Expected Lift = -5%

This means: To validate a ROAS of 10, we need to see at least a 5% reduction in revenue when reducing spend. If the reduction is smaller than 5%, the actual ROAS is likely lower than 10.
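To make the link between ROAS and expected lift concrete, the sketch below uses a deliberately simplified relationship: the implied revenue change equals ROAS times the spend change, expressed as a share of baseline revenue. This is an assumption for illustration only. It ignores geographic variance and test duration, which also enter the calculation, and it is not Cassandra's actual formula. The spend and revenue figures are made up.

```python
def expected_lift(roas, spend_change, baseline_revenue):
    """Revenue change (fraction of baseline) implied by a ROAS assumption."""
    return roas * spend_change / baseline_revenue


def implied_roas(observed_lift, spend_change, baseline_revenue):
    """ROAS implied by an observed lift, under the same simplification."""
    return observed_lift * baseline_revenue / spend_change


# Hypothetical hold-out: cut 5,000 of spend against 1,000,000 of baseline revenue.
threshold = expected_lift(roas=10, spend_change=-5_000, baseline_revenue=1_000_000)
print(f"{threshold:.1%}")  # -5.0%

# If revenue only drops 3%, the implied ROAS falls short of the assumed 10.
print(round(implied_roas(-0.03, -5_000, 1_000_000), 2))  # 6.0
```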

Running the Experiment

Implementation Steps

Select test regions based on Cassandra’s recommendations
Adjust campaign targeting/budgets accordingly
Monitor spend and delivery throughout test period
Maintain consistent tracking and measurement
Allow for additional time to measure delayed effects

Best Practices

Avoid major changes to other channels during test period
Consider seasonal effects and promotional calendar
Document any external factors that might impact results
Maintain test conditions for full duration
Monitor for any technical issues or tracking problems

Results Interpretation

Key Metrics

Measured lift/impact
Statistical significance
Incremental ROAS
Total incremental revenue attributed to the test
Confidence intervals

Understanding Results

Results show actual incremental impact vs. control group
Confidence level indicates reliability of results (aim for 95%, i.e. a significance level of 5%)
Confidence intervals show possible range of true effect
Compare actual lift to expected lift to validate ROI assumptions
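One way to read these points together is sketched below: a result is treated as significant when its confidence interval excludes zero, and as meeting the threshold when the measured lift is at least as large as the expected lift in the same direction. The function and its decision rules are an illustrative assumption, not the platform's reporting logic.

```python
def interpret_result(lift, ci_low, ci_high, expected_lift):
    """Summarize a measured lift against its interval and the expected lift."""
    # Significant if the whole confidence interval sits on one side of zero.
    significant = ci_low > 0 or ci_high < 0
    # Threshold met if the effect has the expected sign and at least its size.
    same_direction = lift * expected_lift > 0
    meets_expected = same_direction and abs(lift) >= abs(expected_lift)
    return {"significant": significant, "meets_expected": meets_expected}


# Hypothetical hold-out result: -3% lift, 95% interval from -5% to -1%,
# against an expected lift of -5%.
print(interpret_result(-0.03, -0.05, -0.01, expected_lift=-0.05))
# {'significant': True, 'meets_expected': False}
```

In this hypothetical case the effect is real but smaller than the threshold, which would suggest the input ROI assumption was too optimistic.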

Using Results

Model Calibration
- Input results into model calibration
- Update channel ROI assumptions
- Refine budget allocation recommendations
Strategic Planning
- Inform budget allocation decisions
- Guide channel strategy
- Validate or challenge existing assumptions