Unlocking Information from Inconclusive A/B Test Results

Duration - 55 minutes

Key Takeaways

  • Re-evaluate the pre-analysis phase of your testing and roadmap to ensure you're basing your results on real experiences, not phantom ones.
  • Utilize tools like A/B testing calculators to optimize your business and understand your baseline.
  • Review past test results and filter them through different processes to uncover new insights.
  • Implement a structured approach to testing, which can reduce the probability of running into inconclusive tests.
  • Prioritize quality over quantity in testing. Running a lot of tests isn't necessarily the best approach; it's more important to test the right things.
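
The kind of arithmetic an A/B testing calculator performs during pre-analysis can be sketched as follows. This is a minimal illustration, not the calculator referenced in the session: it assumes a standard two-sided, two-proportion z-test, and the function name, the 5% significance level, the 80% power, and the example rates are all hypothetical choices.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-proportion z-test.

    baseline_rate: current conversion rate of the control (e.g. 0.05 for 5%).
    relative_lift: smallest relative improvement worth detecting (e.g. 0.2 for +20%).
    alpha, power:  assumed significance level and statistical power.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("rates must be distinct and lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p1 + p2) / 2

    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


if __name__ == "__main__":
    # Hypothetical baseline: 5% conversion, looking for a 20% relative lift.
    print(sample_size_per_variant(0.05, 0.20))
```

Under these assumptions, a 5% baseline and a 20% relative lift call for on the order of 8,000 visitors per variant. A test that stops well short of the required sample is a likely candidate for an inconclusive result, whatever the variants look like.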

Summary of the session

The webinar, hosted by Vipul from VWO, featured Kenya Davis, the Senior Manager of Decision Science at Evolytics, who shared her expertise on interpreting inconclusive results in A/B testing. Davis emphasized the importance of a structured approach to testing, including a thorough pre-analysis phase, and the balance between quality and quantity in testing.

She also introduced a three-step unpacking process for inconclusive results and encouraged participants to revisit past test results using these methods. Vipul praised Davis's clarity of thought and structured approach to testing, sparking a discussion on the importance of testing the right things over testing more. Davis also offered her assistance to anyone needing help with testing and optimization, providing links to useful resources such as A/B testing calculators.

Top questions asked by the audience

  • What are your thoughts on structured testing?

    - by Vipul
    In regard to structured testing, I chalk it up more to structured experimentation programs and less to the test itself, because if you have the right checks and balances along the way, it sets every test you deploy up for success in terms of being conclusive. You can even go into it knowing that you may or may not get a result, based on how your team is set up. To give you an example, I've worked with past clients where a pre-analysis is done, and that pre-analysis includes checking tagging, checking the flow of the customers, and checking outside data, not just your internal company's opinions and data. That alone has helped to sort tests that work from tests that don't, and to shave off tests that are biased or that don't really answer the question. That seems like a tedious process, and it is part of the structuredness you're speaking to, but it's something that really allows everyone to feel more empowered. It does cut the ability to run a lot of tests, but a lot of tests isn't necessarily the best thing to do. Quality over quantity is always going to be the winner, I think, for anyone in testing.
  • Suppose we have a test where variant B has been winning for 2 weeks but is not statistically significant, sitting at, for example, 80%. Usually, we consider this inconclusive. In such a case, what would you suggest we do?

    - by CUDA
    In that case, there's a lot to look at. If conclusive results for that page and that same KPI normally land around 90 to 95%, then I wouldn't necessarily use those results if it's stopping at 80%. I would look back at the variants I'm comparing: was there enough distinguishability between them? And I'd honestly jump straight to step 3. I know I said don't skip steps, but if you know the setup, and you know that the KPI, the page, and the location are always the same, there's clearly something else happening there. Now, if you have 2 prior tests that reached 95 or 99% and this one has 80%, you may want to reconsider altogether what your level of confidence should be.
  • I'm currently running a multivariate test and my main KPI is conversion rate. Let's say that I've followed the three steps and everything looks okay, and that the full test has run through the COVID season from March to May. My results still lack statistical confidence, sitting between 83 and 85%, but they haven't changed much during the last weeks, and we haven't identified patterns or commonalities in the winning variations. Should I stop testing and declare a winner now, or should I keep running the test?

    - by Gerardo
    So there are a few ways to look at this one. It is really asking: once you get through the three steps, what should you do at that point? I would say that, as I mentioned on the last one, if you're always peaking at the higher end of your confidence, around 90 to 95%, break down that test. If it's not giving you the answers, then maybe somewhere within that funnel there's some type of variation happening along the way that's causing uncertainty in the calculation and keeping it around 83 to 85%. So I wouldn't necessarily say turn it off. If it's possible to run it concurrently with a much more specific test on a metric slightly under the KPI of conversion rate, that will give you some insight into the point at which the groups really start to vary.
  • What is the most common reason for an inconclusive test, given the examples that you've listed?

    - by Carrie Wilkins
    I would say the most common one I have seen, based on the ones I've listed, is also the easiest and the worst: setting it up at the wrong point, so firing it at the wrong time. As we went through in that first step, the person set up the test to fire at login versus sign-up versus the site level. What that means is that all the customers get, and this is the word I'm looking for, the client-side version of testing. If we look at the way this test should have been set up, because they're getting 2 completely different sites, it should have been server-side. I know that's the argument of client-side versus server-side, but really it's a logical thing. You want to have both; it isn't an either-or statement. It's more about what gives you the ability to measure even site conversion at the end. If you have one group whose whole website and experience are consistent, with promotions sprinkled in and out, but all tied to one unique program, then maybe those people need a completely different website experience than the ones who are not seeing any of that. It basically boils down to how the test should be set up. I've seen that run for a week of arguments upon arguments, and it was almost settled by just setting it up in different ways, that being an experiment in itself.
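
The confidence figures discussed in the answers above (80%, 83 to 85%, 90 to 95%) can be reproduced approximately with a short calculation. The sketch below assumes that "confidence" means one minus the two-sided p-value of a pooled two-proportion z-test, which is how many simple A/B calculators report it. Testing tools may use a different definition (for example, a Bayesian probability of beating the control), so the function name and the visitor counts here are purely illustrative.

```python
from math import sqrt
from statistics import NormalDist


def confidence_level(visitors_a, conversions_a, visitors_b, conversions_b):
    """Return 1 - p-value for the difference in conversion rate between A and B.

    Assumes a pooled two-proportion z-test with a two-sided alternative.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_error == 0:
        return 0.0  # no conversions (or all conversions): nothing to distinguish

    z_score = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z_score)))
    return 1 - p_value


if __name__ == "__main__":
    # Hypothetical counts: control converts 100 of 1000, variant 150 of 1000.
    print(f"{confidence_level(1000, 100, 1000, 150):.1%}")
```

Running the same function on each segment or funnel step of a past test is one way to see where the groups start to diverge, in the spirit of breaking the test down as suggested above.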


Disclaimer: Please be aware that the content below is computer-generated, so kindly disregard any potential errors or shortcomings.

Vipul from VWO: Hi everyone. Thank you so much for joining in for this webinar. I hope you and your family are safe inside your respective homes, and I wish you all good health. My name is Vipul, and I am the marketing ma ...