Most Google Ads “optimizations” are guesses. Someone on your team — or at your agency — looks at a metric that moved, forms an opinion about why, makes a change, and then watches the next two weeks to see if it gets better or worse. If it gets better, they take credit. If it gets worse, they blame seasonality.
That’s not optimization. That’s superstition with a dashboard.
Google Ads experiments exist precisely to fix this problem. They let you split your campaign traffic between a control and a variant, isolate one variable at a time, and get a statistically meaningful answer before you commit your entire budget to a change. The feature has been in Google Ads for years, and most advertisers either don’t know it exists or use it wrong. This article walks you through how to use it correctly — from setup to reading results to knowing when the data is actually telling you something.
- Google Ads experiments let you A/B test bidding strategies, ad copy, landing pages, and match types against a live control — without disrupting your whole campaign.
- Setting a clear, single-variable hypothesis before you run an experiment is what separates useful data from noise.
- You need enough traffic for statistical significance — experiments on low-volume campaigns will mislead you more than help you.
- The drafts and experiments workflow in Google Ads lets you stage changes before applying them, which is underused even by experienced advertisers.
- Applying a winning experiment takes one click — but knowing when it’s actually a winner requires understanding confidence thresholds, not just directional improvement.
What Google Ads Experiments Actually Are (And What They’re Not)
A Google Ads experiment is a controlled test that runs a modified version of your campaign (the variant) against the original (the control) simultaneously, splitting traffic between them at a ratio you define. Because both versions run at the same time against the same auction conditions, seasonal noise is largely neutralized. That’s the core advantage over before-and-after comparisons.
What experiments are not: a way to test totally different campaign structures, different audiences across campaigns, or broad strategic pivots. They work best when you’re isolating one specific variable — a bidding strategy switch, a headline swap, a landing page change, a match type shift.
Google’s interface separates experiments into two types. Custom experiments (formerly called drafts and experiments) let you clone a campaign, make changes to the draft, and run it against the original. Ad variation experiments let you swap out specific copy elements — headlines, descriptions — across multiple campaigns at once. Both are useful, but they answer different questions. Custom experiments are for structural or bidding changes; ad variations are for copy testing at scale.
If you’re reading this while also thinking about whether your smart bidding setup is actually optimizing toward the right signal, our breakdown of Google Ads smart bidding strategies covers the decision framework in depth — because experiments are the best way to validate any bidding strategy change before it becomes your default.
The Hypothesis-First Rule Nobody Follows
Before you touch the experiment setup UI, write down one sentence: “I believe [change] will [improve/reduce] [metric] because [reason].”
That’s it. One variable. One directional prediction. One reason grounded in something you already know about your account.
“I believe switching from Target CPA to Maximize Conversions will lower our cost per lead because the algorithm currently has 90+ conversions per month and our manual CPA target is likely constraining bid competitiveness in peak hours.”
That’s a testable hypothesis. “Let’s try some new ads and see what happens” is not.
This matters because without a hypothesis, you’ll torture the data until it confesses something that isn’t real. You’ll look at 12 different metrics, find the one that moved favorably, and declare the experiment a success. That’s how you end up making changes that feel validated but aren’t.
Pick your primary metric before the experiment starts. Secondary metrics can inform interpretation, but one metric is the arbiter of win or loss. For most lead gen campaigns, that’s cost per conversion or conversion rate. For ecommerce, it’s ROAS or revenue per impression. Don’t let CTR or CPC gaslight you into thinking an experiment won when your conversions went sideways.
Step-by-Step: Setting Up a Campaign Experiment
Here’s exactly how to run a custom campaign experiment in Google Ads:
Step 1: Navigate to Experiments
In your Google Ads account, click the Campaigns icon in the left nav, then select Experiments. You’ll see options for “Custom experiments” and “Ad variations.” For bidding or structural changes, choose Custom experiments.
Step 2: Create a Draft
Select the campaign you want to test. Click + New experiment and Google will create a draft — an exact copy of your campaign — that you can modify freely without affecting the live campaign. This is the drafts and experiments workflow, and it’s genuinely useful: you can stage and review every change before a single dollar is split.
Make your single change to the draft. One change. If you switch your bidding strategy AND change three headlines, you’ll never know which one drove the result.
Step 3: Set Your Traffic Split
Google lets you split traffic anywhere from 50/50 to 90/10. The right split depends on your risk tolerance and traffic volume.
- Use 50/50 when you want to reach significance as fast as possible and the change is low-risk (copy test, minor bid adjustment).
- Use 70/30 or 80/20 (favoring the control) when you’re testing something that could hurt performance — like a new bidding strategy on a campaign that’s currently your revenue backbone.
Cookie-based splitting means the same user (strictly, the same browser cookie) keeps seeing the same version. Search-based splitting (sometimes called query-based) assigns a version at random on each search, so one person can end up seeing both. Cookie-based is more accurate for user behavior tests; search-based converges to significance faster on high-volume campaigns. Google defaults to cookie-based, so leave it there unless you have a specific reason not to.
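To make the difference between the two split modes concrete, here is a minimal sketch of deterministic bucketing. Google does not publish how it assigns traffic, so this is an illustration of the general idea, not its implementation; `assign_arm`, the identifiers, and the hashing scheme are all assumptions made up for the example.

```python
import hashlib


def assign_arm(unit_id: str, experiment_id: str, variant_share: float = 0.5) -> str:
    """Deterministically map an identifier to 'variant' or 'control'.

    Hypothetical helper: hashes the identifier into [0, 1) and compares it
    with the share of traffic reserved for the variant.
    """
    digest = hashlib.sha256(f"{experiment_id}:{unit_id}".encode("utf-8")).hexdigest()
    # First 8 hex characters -> integer in [0, 2**32), scaled to [0, 1).
    bucket = int(digest[:8], 16) / 2**32
    return "variant" if bucket < variant_share else "control"


# Cookie-based: key on a stable user identifier, so repeat visits land in the same arm.
first_visit = assign_arm("cookie-abc123", "exp-bidding-test")
second_visit = assign_arm("cookie-abc123", "exp-bidding-test")

# Per-search: key on an identifier that changes with every search,
# so the same person can be served either arm.
arms_seen = {assign_arm(f"search-{i}", "exp-bidding-test") for i in range(200)}

# An 80/20 split favoring the control only changes the threshold.
cautious = [assign_arm(f"cookie-{i}", "exp-bidding-test", variant_share=0.2) for i in range(10_000)]
variant_fraction = cautious.count("variant") / len(cautious)
```

Keying on the cookie keeps each user's experience consistent across visits, which is why it suits behavior tests; keying on each search gives more independent observations per day, which is why it tends to reach significance sooner.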
Step 4: Set Your Duration
Run experiments for a minimum of two weeks. Four weeks is better. The goal isn’t just hitting a time threshold: it’s collecting enough data that your primary metric reaches a confidence level of 95% or higher by the end date you set, which Google will show you directly in the experiment report.
Do not end experiments early just because results look promising at day five. That’s called peeking bias, and it’s a well-documented statistical trap that causes false positives. Set a calendar reminder for the end date and leave it alone.
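If the peeking trap sounds abstract, a small simulation shows it. The sketch below runs A/A tests, where both arms have the same true conversion rate, so every "winner" is a false positive. It compares checking significance every day against checking once at the planned end date. The traffic numbers, the conversion rate, and the pooled two-proportion z-test are assumptions chosen for illustration; they are not how Google computes its report.

```python
import random
from statistics import NormalDist


def two_sided_p(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    if se == 0:
        return 1.0  # no conversions (or all conversions) in both arms: nothing to detect
    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulate(days=28, clicks_per_day=100, rate=0.05, runs=300, seed=7):
    """Return (false-positive rate with daily peeking, rate with one final check)."""
    rng = random.Random(seed)
    peeking_hits = fixed_hits = 0
    for _ in range(runs):
        conv_a = conv_b = clicks = 0
        would_have_stopped = False
        for _day in range(days):
            clicks += clicks_per_day
            # Both arms share the same true rate, so any difference is pure noise.
            conv_a += sum(rng.random() < rate for _ in range(clicks_per_day))
            conv_b += sum(rng.random() < rate for _ in range(clicks_per_day))
            if two_sided_p(conv_a, clicks, conv_b, clicks) < 0.05:
                would_have_stopped = True  # a daily peeker declares a winner here
        peeking_hits += would_have_stopped
        fixed_hits += two_sided_p(conv_a, clicks, conv_b, clicks) < 0.05
    return peeking_hits / runs, fixed_hits / runs


peeking_rate, fixed_rate = simulate()
print(f"False positives when peeking daily: {peeking_rate:.0%}")
print(f"False positives with one check at the end: {fixed_rate:.0%}")
```

The single end-of-test check should land near the nominal 5% false-positive rate, while the daily peeker crosses the threshold at some point in a much larger share of runs, even though nothing changed between the arms.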
Step 5: Launch
Click Start Experiment. Google will serve both versions simultaneously, report performance side by side, and flag when you’ve reached statistical significance. You don’t need to manually calculate anything — but you do need to understand what you’re reading.
Reading the Results: What Actually Means Something
The experiment report shows you side-by-side performance for the control and variant across all major metrics. Here’s how to read it without fooling yourself.
Confidence level is the number that matters most. Google displays this as a percentage; aim for 95%+. Loosely, that means a gap this large would show up by chance only about 5% of the time if the control and variant truly performed the same. At 80%, chance alone would produce a gap like this about one time in five, which is too shaky to bet a budget on. At 70%, you should ignore the result entirely.
Directional signals with low confidence are noise. If the variant shows a 12% better CPA at 75% confidence, that’s not a win. That’s an inconclusive test. Extend the experiment, increase traffic volume, or accept that the difference is too small to detect with your current traffic levels.
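To see why the same percentage lift can be noise at one traffic level and a real signal at another, here is a rough sketch using a standard pooled two-proportion z-test on conversion rate. Google does not document the exact method behind its confidence figure, so treat `confidence_of_difference` as an assumed stand-in, and the click and conversion counts as made-up inputs.

```python
from statistics import NormalDist


def confidence_of_difference(control_conv: int, control_clicks: int,
                             variant_conv: int, variant_clicks: int) -> float:
    """Return 1 - p for a two-sided pooled two-proportion z-test.

    Assumed stand-in for a "confidence level"; not Google's formula.
    """
    p_control = control_conv / control_clicks
    p_variant = variant_conv / variant_clicks
    pooled = (control_conv + variant_conv) / (control_clicks + variant_clicks)
    se = (pooled * (1 - pooled) * (1 / control_clicks + 1 / variant_clicks)) ** 0.5
    z = (p_variant - p_control) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return 1 - p_value


# Same 25% relative lift in conversion rate (4.0% -> 5.0%), different volumes.
small_test = confidence_of_difference(40, 1_000, 50, 1_000)
large_test = confidence_of_difference(400, 10_000, 500, 10_000)

print(f"1,000 clicks per arm:  {small_test:.1%}")
print(f"10,000 clicks per arm: {large_test:.1%}")
```

With 1,000 clicks per arm the result falls well short of 95%, and with ten times the traffic the identical lift clears it comfortably. That is the practical meaning of "extend the experiment or increase traffic volume."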
Secondary metrics help you understand why the primary metric moved. If CPA dropped but conversion rate also dropped and average order value went up, something more complex is happening. If CTR went up but conversions went down, your new ad copy is attracting worse-fit traffic — a classic false positive scenario that copy testing frameworks specifically try to guard against.
When you have a clear winner at 95%+ confidence, Google gives you a one-click Apply button to make the experiment the new baseline. Click it. It seamlessly transitions traffic without a campaign restart, which matters for smart bidding learning period continuity.
The Four Experiments Worth Running Right Now
Not all experiment ideas are worth your time. These four consistently surface meaningful results across the accounts we manage:
1. Bidding Strategy Switch
Testing a move from manual CPC or Enhanced CPC to Target CPA, or from Target CPA to Maximize Conversions. This is the highest-stakes experiment most accounts should run — the difference between strategies can be 20–40% in CPA on mature campaigns. The risk of just switching without testing is real: smart bidding needs a learning period, and if the new strategy underperforms, you’ve already disrupted your winning campaign. For deeper context on how these algorithms actually behave, see our guide on how Google Ads smart bidding actually works.
2. Match Type Shift
Testing the introduction of broad match keywords against a phrase/exact-only control. This is one of the most contentious experiments in paid search right now. Broad match with smart bidding can work — but in many accounts it quietly inflates spend while diluting lead quality. Experiment before you commit. And if you want the full context on when broad match is worth the risk, our honest take on Google Ads broad match is worth reading alongside your results.
3. Landing Page Variant
Using ad variation experiments to send 50% of traffic to a revised landing page. Landing page changes are often the highest-leverage test you can run: at the same cost per click, a conversion rate improvement of even 15–20% cuts CPA by roughly 13–17%. Keep everything else identical: the ad copy, the keywords, the bidding. Only the destination URL changes. Good landing page testing principles matter here; our landing page best practices guide covers what the winning variants usually have in common.
4. Ad Copy Test via Ad Variations
Testing a specific headline swap across all RSAs in a campaign or ad group. For example: replacing a features-focused headline (“24/7 Customer Support Included”) with an outcome-focused headline (“Cut Response Time by 40%”). Use the Ad Variations tool under Experiments for this — it’s cleaner than cloning a whole campaign for a copy change.
The Mistakes That Make Experiment Data Worthless
Testing too many variables at once. Change one thing. Seriously. We’ve seen accounts run experiments where the variant had a new bidding strategy, three new headlines, and a different landing page. The experiment showed improvement. Nobody had any idea why, so nobody could replicate it systematically. That’s wasted learning.
Running on low-volume campaigns. If your campaign gets fewer than 50 conversions per month, a standard 50/50 experiment can take six months or more to reach significance unless the effect is very large. At that point, the market has changed enough to make the results unreliable anyway. Low-volume campaigns need a different approach: focus on the changes most likely to have outsized impact and make them directly rather than running under-powered tests.
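You can estimate before launch whether a campaign has the volume for a test. The sketch below uses the textbook sample-size formula for comparing two conversion rates; the 80% power default, the function names, and the example inputs are assumptions for illustration, and Google's own report may use a different method.

```python
from math import ceil
from statistics import NormalDist


def clicks_per_arm(baseline_rate: float, relative_lift: float,
                   confidence: float = 0.95, power: float = 0.80) -> int:
    """Approximate clicks needed in EACH arm to detect the given relative lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided threshold
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def weeks_to_finish(clicks_needed_per_arm: float, weekly_clicks: float,
                    variant_share: float = 0.5) -> float:
    """Weeks until the smaller arm has collected the required clicks.

    Conservative shortcut for uneven splits: it waits for the smaller arm
    to reach the equal-split requirement.
    """
    smaller_share = min(variant_share, 1 - variant_share)
    return clicks_needed_per_arm / (weekly_clicks * smaller_share)


# Hypothetical campaign: 5% conversion rate, about 1,000 clicks (50 conversions) a month.
needed = clicks_per_arm(baseline_rate=0.05, relative_lift=0.15)
weeks = weeks_to_finish(needed, weekly_clicks=250, variant_share=0.5)
print(f"Clicks needed per arm: {needed:,}")
print(f"Estimated duration: {weeks:.0f} weeks")
```

For this hypothetical campaign, detecting a 15% lift takes on the order of 14,000 clicks per arm, which at 250 clicks a week is far longer than any sensible test window. Larger expected lifts shrink the requirement quickly, which is why low-volume accounts should reserve testing for changes likely to have a big effect.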
Ending experiments early. Already covered this, but it’s the most common mistake by far. The experiment UI shows you real-time results, and it’s hard to resist acting on them. Discipline yourself to define your end date upfront and honor it.
Ignoring seasonality windows. Don’t run a four-week experiment that spans Black Friday, a major product launch, or a fiscal quarter end if your business has meaningful seasonality. The control and variant run simultaneously, which neutralizes most seasonal noise, but an enormous spike can still skew results: traffic in that window behaves differently from your normal traffic, so a winner crowned during the spike may not hold up afterward.
Not documenting results. This one sounds basic, but most teams run experiments, make the change if it wins, and then forget what they tested six months later. Keep a simple log: hypothesis, dates, traffic split, result, confidence level, action taken. Over two years, that log becomes an invaluable account-specific playbook. It also prevents you from testing the same thing twice — which happens more than you’d think when agencies turn over or internal teams change.
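The log does not need special tooling. As one possible shape, here is a minimal sketch that stores each experiment as a CSV row with the fields listed above. The `ExperimentRecord` class, the `append_record` helper, and every value in the example entry are hypothetical placeholders, not real results.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ExperimentRecord:
    hypothesis: str
    start_date: str
    end_date: str
    traffic_split: str
    result: str
    confidence_level: float
    action_taken: str


def append_record(handle, record: ExperimentRecord, write_header: bool = False) -> None:
    """Write one experiment as a CSV row, optionally preceded by a header row."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(ExperimentRecord)])
    if write_header:
        writer.writeheader()
    writer.writerow(asdict(record))


# Placeholder entry for illustration only.
entry = ExperimentRecord(
    hypothesis="Switching from Target CPA to Maximize Conversions will lower cost per lead",
    start_date="2024-03-04",
    end_date="2024-04-01",
    traffic_split="50/50",
    result="inconclusive",
    confidence_level=0.78,
    action_taken="kept control; retest with more volume",
)

# An in-memory buffer keeps the example self-contained; in practice, open a file in append mode.
log = io.StringIO()
append_record(log, entry, write_header=True)
rows = list(csv.DictReader(io.StringIO(log.getvalue())))
```

A flat file like this is easy to search before planning a new test, which is what prevents the repeat experiments described above.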
Frequently Asked Questions
How long should I run a Google Ads experiment?
A minimum of two weeks, with four weeks being more reliable for most accounts. The real target isn’t a time threshold — it’s reaching 95% statistical confidence on your primary metric, which Google shows you in the experiment report. Don’t end it early just because results look good. Peeking bias is real and it will give you false positives.
What’s the difference between custom experiments and ad variations?
Custom experiments (the drafts and experiments workflow) clone an entire campaign so you can test structural changes like bidding strategies, match types, or targeting. Ad variations let you swap specific copy elements — headlines, descriptions — across multiple campaigns simultaneously without cloning anything. Use custom experiments for bidding and structural tests; use ad variations for copy tests.
Can I run a Google Ads experiment on a Performance Max campaign?
Yes — Google now supports experiments for Performance Max campaigns, specifically to test PMax against a standard Shopping or Search campaign. This is actually one of the most valuable experiment types available right now, given how much debate exists about whether PMax helps or hurts performance for a given account.
What traffic split should I use for a campaign experiment?
50/50 gives you the fastest path to statistical significance and works well for low-risk changes like copy tests. Use a 70/30 or 80/20 split favoring the control when you’re testing something that could meaningfully hurt a high-performing campaign — like a bidding strategy switch — and you want to limit downside exposure while the test runs.
What confidence level is “good enough” to call an experiment a winner?
95% is the threshold worth acting on. Some teams accept 90% for lower-stakes tests. Anything below that is noise, regardless of how directionally promising it looks. Google displays confidence level directly in the experiment report — don’t act until that number is where it needs to be.
Do Google Ads experiments affect Quality Score or the smart bidding learning period?
Running an experiment doesn’t reset Quality Score on your control campaign. If you apply a winning experiment that includes a bidding strategy change, expect a brief learning period (typically 1–2 weeks) as the algorithm recalibrates. This is true whether you use experiments or not — the advantage of experiments is that you’ve already validated the change before accepting that learning period on your main campaign.
Run Tests Like You Mean It, or Don’t Bother
The difference between accounts that compound their improvement over time and accounts that plateau isn’t budget. It’s systematic testing. Gut-feel optimization works until it doesn’t, and by the time you realize it stopped working, you’ve already spent months chasing problems you created.
Google Ads experiments give you the infrastructure to stop guessing. They’re not complicated. The hard part is the discipline: one variable, clear hypothesis, enough time, honest interpretation. If you’re currently making campaign changes and then just watching to see what happens — you can do better.
If you want to see how we structure testing programs inside client accounts — and what a 90-day experiment roadmap actually looks like — this breakdown of what to expect in the first 90 days with a Google Ads agency gives you a realistic picture. And if you’re evaluating whether your current setup is even worth experimenting on top of, a proper account audit is the right starting point.
Good testing doesn’t make your job harder. It makes every future decision easier — because you stop arguing about what might work and start building a record of what actually does.