In eCommerce, A/B testing is supposed to give you clarity.
In reality, it can sometimes introduce noise.
Many brands still run experiments through JavaScript-based tools layered on top of their storefront, relying on client-side rendering and, in some cases, introducing performance variables alongside the test itself.
That’s the nuance:
You’re not always just testing a feature; you may also be testing how that feature is delivered.
With the introduction of Shopify Rollouts, that model starts to evolve.
This isn’t about replacing existing tools.
It’s about introducing a cleaner, more native way to run certain types of experiments directly within Shopify’s core.
What is Shopify Rollouts? (And Why It Matters)
Shopify Rollouts is a native, centralised system within the Shopify admin designed for scheduling, testing and controlling changes to an online store.
At a functional level:
- Rollout = A/B test
- Control = your current live theme (A)
- Treatment = a modified version of that theme (B)
But the real distinction is how it works.
Instead of injecting changes via JavaScript, Shopify:
- Stores theme customisation states (JSON templates)
- Serves different versions server-side
- Tracks performance natively within the platform
That alone eliminates a significant portion of technical debt in traditional CRO setups.
What Actually Changes With Rollouts
At a glance, Rollouts looks familiar. Control vs treatment. Traffic split. Standard metrics.
But the mechanism is completely different.
Traditional tools sit on top of your storefront. They inject changes after the page starts loading. That can introduce flicker, delay rendering and create a slight gap between what’s tested and what ultimately goes live.
Shopify removes that layer.
Instead of injecting changes, it saves theme configurations (the JSON that defines layouts and sections) and serves different versions directly from the server.
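To make the idea of server-side serving concrete, here is a minimal sketch of how a server could assign a visitor to a stored configuration before the page is rendered. This is not Shopify's actual implementation: the `bucket` and `config_for` functions, the hashing approach and the two tiny configurations are all assumptions chosen for illustration.

```python
import hashlib

# Hypothetical, heavily simplified theme configurations.
# Real JSON templates contain many more sections and settings.
CONTROL = {
    "sections": {"hero": {"type": "hero", "settings": {"heading": "Summer sale"}}},
    "order": ["hero"],
}
TREATMENT = {
    "sections": {"hero": {"type": "hero", "settings": {"heading": "New arrivals"}}},
    "order": ["hero"],
}


def bucket(visitor_id: str, rollout_id: str) -> float:
    """Map a visitor to a stable number in [0, 1) for a given rollout."""
    digest = hashlib.sha256(f"{rollout_id}:{visitor_id}".encode()).hexdigest()
    # Use the first 32 bits of the hash and scale them into [0, 1).
    return int(digest[:8], 16) / 0x100000000


def config_for(visitor_id: str, rollout_id: str, treatment_share: float = 0.5) -> dict:
    """Pick the configuration to render (assumed logic, for illustration only)."""
    if bucket(visitor_id, rollout_id) < treatment_share:
        return TREATMENT
    return CONTROL
```

Because the choice is made before any HTML is sent, the same visitor always receives the same version and nothing is swapped in the browser afterwards, which is why flicker does not arise in this model.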
How Shopify Rollouts Compares to Traditional A/B Testing Tools
Setting Up a Rollout (Where Most Teams Slip Up)
Rollouts sit inside Shopify Markets, not the theme editor.
The setup is simple, but easy to misconfigure.
First, naming.
At scale, this matters more than it seems. Without a structured format (ticket IDs, test names), you lose visibility fast.
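One way to enforce a structured format is to generate names rather than type them. The sketch below assumes a convention of ticket ID, page area and test name separated by pipes. That convention, the `NAME_PATTERN` regex and the `rollout_name` helper are illustrative assumptions, not anything Shopify requires.

```python
import re

# Hypothetical convention: "<TICKET-ID> | <page area> | <short test name>"
NAME_PATTERN = re.compile(r"^[A-Z]+-\d+ \| [^|]+ \| [^|]+$")


def rollout_name(ticket_id: str, area: str, test_name: str) -> str:
    """Build a rollout name and reject anything that breaks the convention."""
    name = f"{ticket_id} | {area.strip()} | {test_name.strip()}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"Name does not follow the convention: {name!r}")
    return name


print(rollout_name("CRO-142", "PDP", "Sticky add to cart"))
# CRO-142 | PDP | Sticky add to cart
```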
Then comes traffic.
Shopify splits this into two layers:
- How many users enter the test
- How those users are split between variants
This is where mistakes happen.
If you set 50% entry and 50% split, you’re not running a 50/50 test. Half your users aren’t included at all, and each variant is seen by only 25% of your total traffic.
To run a proper A/B test:
- Set 100% traffic entry
- Split 50/50 between control and treatment
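The arithmetic behind the two layers can be sketched in a few lines. The `exposure` function and its parameter names are illustrative, not Shopify settings; it simply multiplies the entry share by the split to show what share of all visitors lands in each group.

```python
def exposure(entry_share: float, treatment_split: float) -> dict:
    """Return the share of ALL visitors that ends up in each group."""
    if not (0 <= entry_share <= 1 and 0 <= treatment_split <= 1):
        raise ValueError("Shares must be between 0 and 1")
    return {
        "treatment": entry_share * treatment_split,
        "control": entry_share * (1 - treatment_split),
        "excluded": 1 - entry_share,  # visitors who never enter the test
    }


print(exposure(0.5, 0.5))  # {'treatment': 0.25, 'control': 0.25, 'excluded': 0.5}
print(exposure(1.0, 0.5))  # {'treatment': 0.5, 'control': 0.5, 'excluded': 0.0}
```

With 50% entry, each variant collects data from only a quarter of visitors, so the test needs roughly twice as long to reach the same sample size.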
You also define timing: start, end and what happens after.
This is where Shopify removes friction. You can choose to automatically apply the winning variant, turning experimentation into deployment.
The Treatment Theme (What You’re Really Editing)
This is where expectations need to shift.
You’re not creating a new theme.
You’re not writing new code.
You’re defining a variation of the live theme using the customiser.
Once you click “Add changes”, Shopify opens the customiser and captures everything as a separate configuration.
That includes:
- Sections and layouts
- Content and imagery
- Theme settings like colours and typography
- App embeds (in many cases)
But not everything is available.
You can’t test .liquid changes.
You can’t touch checkout or accounts.
And you can’t run multiple theme variants.
Under the hood, Shopify is simply saving JSON states and serving them to different users.
How We’d Use This at WIRO (Real Workflow)
For new components, the process is clean.
Build both variants.
QA them properly.
Add both to the live theme.
Then assign:
- Variant A → control
- Variant B → treatment
Run the test. Pick a winner. Remove the rest.
For existing components, it depends on complexity. Sometimes a toggle works. Sometimes duplication is cleaner.
Either way, everything stays within the customiser layer.
That’s the constraint, but also the strength.
WIRO Example: Where This Changes Outcomes
We’ve seen this scenario repeatedly with traditional tools:
A brand tests a new PDP layout using a JS-based platform.
The variant appears to underperform slightly.
On the surface, the conclusion is clear: revert.
But when you dig deeper, the variant introduced additional JS, slowed down rendering and increased layout shift.
The drop may not have been driven entirely by UX. It can also be influenced by how the test is delivered, particularly if additional scripts affect load or rendering.
With Rollouts, that layer is removed.
Which means you’re closer to measuring the true impact of the change itself, rather than any side effects from the testing setup.
During a Rollout: Where Things Can Break
Once live, rollouts appear in the Online Store channel. You can monitor, pause, or end tests at any time.
But there’s one rule that matters.
Changes to the live theme affect the control.
If someone edits the customiser mid-test, your baseline shifts. Your data becomes unreliable.
Code changes are safer because they apply to both variants, but anything that alters JSON templates introduces risk.
Best practice is simple: don’t touch the customiser during a test.
Analytics: Simple, But Enough
Shopify provides the core metrics you actually need to evaluate performance: conversion rate, add-to-cart rate, reached checkout, bounce rate and sessions. It’s not overloaded with unnecessary data, but it covers the fundamentals.
What makes it more valuable is the inclusion of confidence intervals. This shifts the focus from just seeing uplift to understanding how reliable that uplift actually is. You’re not just looking at results; you’re assessing their statistical strength.
For a native tool, that’s a solid foundation.
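To show what a confidence interval adds, here is a minimal sketch using the standard normal-approximation interval for the difference between two conversion rates. How Shopify calculates its own intervals is not stated here, so treat this as a textbook method rather than the platform's; the `uplift_interval` function and the session and order counts are invented for illustration.

```python
import math


def uplift_interval(control_orders: int, control_sessions: int,
                    treatment_orders: int, treatment_sessions: int,
                    z: float = 1.96) -> tuple[float, float]:
    """Approximate 95% interval for (treatment rate - control rate)."""
    p_c = control_orders / control_sessions
    p_t = treatment_orders / treatment_sessions
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_c * (1 - p_c) / control_sessions
                   + p_t * (1 - p_t) / treatment_sessions)
    diff = p_t - p_c
    return diff - z * se, diff + z * se


# Made-up figures: 3.0% vs 3.3% conversion on 10,000 sessions each.
low, high = uplift_interval(300, 10_000, 330, 10_000)
print(f"Uplift between {low:+.2%} and {high:+.2%}")
```

In this invented case the interval spans zero, so an apparent 10% relative uplift is not yet distinguishable from no change at all. That is the kind of reading a raw uplift figure hides.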
Ending a Rollout (The One Risk to Manage)
At the end of a rollout, the decision is straightforward: apply the treatment or discard it.
Applying the treatment is immediate, which is one of the biggest advantages of Rollouts. However, it comes with an important trade-off. When you apply it, Shopify overwrites any customiser changes that were made to the live theme during the test.
That means teams need to be disciplined in how they manage changes. Either avoid making customiser edits while a rollout is active, or ensure those changes are manually synced before applying the final result.
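One way to support that discipline is to snapshot the live theme's JSON when the rollout starts and compare it again before applying the treatment. The sketch below assumes you can obtain both versions as Python dictionaries by whatever export route you use. The `changed_keys` helper and the sample data are hypothetical.

```python
_MISSING = object()  # marks a key that exists on one side only


def changed_keys(snapshot: dict, current: dict, prefix: str = "") -> list[str]:
    """List dotted paths whose values differ between two nested dicts."""
    changes = []
    for key in sorted(set(snapshot) | set(current)):
        path = f"{prefix}{key}"
        old = snapshot.get(key, _MISSING)
        new = current.get(key, _MISSING)
        if isinstance(old, dict) and isinstance(new, dict):
            changes.extend(changed_keys(old, new, prefix=f"{path}."))
        elif old != new:
            changes.append(path)
    return changes


# Hypothetical live-theme state at rollout start and just before applying.
at_start = {"sections": {"hero": {"settings": {"heading": "Summer sale"}}}, "order": ["hero"]}
before_apply = {"sections": {"hero": {"settings": {"heading": "Flash sale"}}}, "order": ["hero"]}

print(changed_keys(at_start, before_apply))
# ['sections.hero.settings.heading']
```

Any path in the output is a customiser edit made during the test, which would need to be synced manually before the treatment is applied.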
Where Rollouts Wins (And Where It Doesn’t Yet)
Rollouts performs best in the areas where most CRO activity happens: UX improvements, layout changes and merchandising experiments. It removes the performance side effects of client-side injection, eliminates the need for duplicate implementation and significantly speeds up deployment.
That said, it’s not a complete solution yet. There’s no audience segmentation, no support for multi-variant testing and no ability to experiment with backend logic or .liquid changes.
For now, it’s a powerful tool within a defined scope, but one that’s likely to expand quickly.
Final Thought
Shopify is steadily moving towards owning more of the optimisation layer.
Rollouts is part of that direction.
It’s not a complete replacement for every CRO tool, but it sets a new baseline for how testing can work natively within the platform.
And for many UX and merchandising experiments, it offers a simpler, more controlled way to move from idea → test → deployment.



