What is peeking?
Peeking means checking A/B test results while the experiment is still running and acting on them early. Modern testing dashboards make this tempting: significant-looking differences can appear after very little data. However, continuously monitoring results and halting experiments prematurely can bias your conclusions, because it favors differences that arise from random fluctuations rather than genuine effects.
Peeking inflates the risk of a Type 1 error, or false positive: mistakenly concluding that a variation has a real effect when it actually doesn't.
Example of peeking
Let’s say you’re testing two different versions of your website’s homepage to see which one leads to more purchases. After just a few days of running the test, you notice that variation 1 of the homepage is performing significantly better than variation 2. Excited by these early results, you decide to stop the test and implement variation 1 across the entire website.
However, what you fail to realize is that variation 1's early lead could be a temporary fluctuation or pure chance. When you peek repeatedly, you are likely to detect a difference even when none exists (a false positive). By stopping the test too early and acting on incomplete data, you risk implementing a change that does not benefit the business in the long term. Worse, uplift estimates from small samples tend to be exaggerated, so early results are not just unreliable but biased upward, and decisions based on them can lead to incorrect conclusions.
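The inflation of false positives from peeking is easy to demonstrate with a simulation. The sketch below (illustrative only; the conversion rate, sample size, and peeking interval are arbitrary assumptions) runs A/A tests, where both variations are identical, and "peeks" with a two-proportion z-test every 100 visitors per arm. Even though there is no real difference, the chance of declaring a winner at some peek far exceeds the nominal 5% significance level:

```python
import random
import math

def z_test_p(successes_a, n_a, successes_b, n_b):
    """Two-sided p-value for a two-proportion z-test."""
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (successes_a / n_a - successes_b / n_b) / se
    # Two-sided p-value via the normal CDF (expressed with erf)
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def run_experiment(n_per_arm, peek_every, rng):
    """Simulate an A/A test (no real difference) with periodic peeking.

    Returns True if any interim look shows p < 0.05 -- i.e., the
    experimenter would have stopped early and declared a winner.
    """
    conv_rate = 0.10  # identical in both arms: any 'win' is a false positive
    sa = sb = 0
    for i in range(1, n_per_arm + 1):
        sa += rng.random() < conv_rate
        sb += rng.random() < conv_rate
        if i % peek_every == 0 and z_test_p(sa, i, sb, i) < 0.05:
            return True
    return False

rng = random.Random(42)
trials = 2000
false_positives = sum(run_experiment(2000, 100, rng) for _ in range(trials))
print(f"False-positive rate with peeking: {false_positives / trials:.1%}")
```

With 20 peeks per experiment, the observed false-positive rate lands well above the 5% an experimenter believes they are running at, which is exactly the peeking problem.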
When is peeking allowed?
Peeking can result in false positives as well as inflated uplift estimates at small sample sizes (the latter effect is often called the winner's curse).
As a result, experimenters must choose between fixed horizon testing and sequential testing based on their approach to handling peeking.
In traditional fixed horizon testing, where an experiment runs for a predetermined duration or sample size, peeking at results before the test concludes is generally discouraged. Running the test to completion ensures enough data for reliable conclusions, preserves statistical validity, reduces bias, and keeps the Type 1 error rate at its intended level.
However, in modern businesses, sequential testing methodologies have emerged, allowing for more adaptive experimentation and timely feature launches. In sequential testing, data is continuously monitored, and decisions about stopping the test or making adjustments can be made along the way. This approach accommodates peeking to some extent, as long as it is done cautiously to avoid premature conclusions based on incomplete data.
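One common way sequential testing accommodates peeking is by tightening the significance threshold at each interim look so the overall Type 1 error stays near 5%. The sketch below illustrates this with a Pocock-style constant boundary (this is a generic illustration of the idea, not VWO's actual method; the tabulated per-look level of 0.0158 corresponds to 5 equally spaced looks at overall alpha 0.05, and all other parameters are assumptions):

```python
import random
import math

def z_test_p(sa, na, sb, nb):
    """Two-sided p-value for a two-proportion z-test."""
    p_pool = (sa + sb) / (na + nb)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / na + 1 / nb))
    if se == 0:
        return 1.0
    z = (sa / na - sb / nb) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Pocock-style boundary: with 5 planned looks, test each look at
# p < 0.0158 so the *overall* Type 1 error stays close to 5%.
LOOKS, PER_LOOK_ALPHA = 5, 0.0158

def sequential_test(n_per_arm, rng, rate_a=0.10, rate_b=0.10):
    """Run up to LOOKS equally spaced interim analyses; stop early
    only when the corrected boundary is crossed."""
    sa = sb = 0
    step = n_per_arm // LOOKS
    for look in range(1, LOOKS + 1):
        for _ in range(step):
            sa += rng.random() < rate_a
            sb += rng.random() < rate_b
        n = look * step
        if z_test_p(sa, n, sb, n) < PER_LOOK_ALPHA:
            return True, look  # significant at this interim look
    return False, LOOKS

rng = random.Random(7)
trials = 2000
fp = sum(sequential_test(2000, rng)[0] for _ in range(trials))
print(f"A/A false-positive rate with corrected looks: {fp / trials:.1%}")
```

Because each look uses a stricter threshold, the experimenter can peek five times and still keep the overall false-positive rate near the nominal 5%, while retaining the ability to stop early when a genuine effect is large.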
VWO has implemented Peeking Correction to ensure that Sequential Testing is accurate and reliable in its revamped reporting system. By using Peeking Correction, VWO adjusts statistical calculations to maintain validity, even when tests are monitored multiple times. This feature helps maintain the integrity of your results, allowing you to make informed decisions without the risk of skewed data.
If you’re seeking both flexibility in reviewing test results and high accuracy, give VWO a try—it comes with robust and dependable reporting features.