AI Bootcamp for Experimentation
Craig reveals how AI transforms product development and experimentation, sharing practical strategies from the 'AI Playbook' to enhance critical thinking and innovation.
Summary
The ConvEx 2024 workshop led by Craig Sullivan explores transformative strategies in customer optimization using AI and large language models (LLMs). Participants learn to address customer pain points through problem exploration, ideation, and experimentation, emphasizing real-world application.
The session showcases hands-on techniques, leveraging AI to analyze feedback, structure problem taxonomies, and brainstorm creative solutions for customer experience improvement. By integrating validated insights and diverse ideas, attendees leave equipped with tools to foster experimentation and productivity in digital transformation.
Key Takeaways
- Learn to optimize customer journeys with AI-driven insights.
- Discover foundational principles for leveraging LLMs like GPT and Claude.
- Reduce solution bias by connecting ideas to real customer problems.
- Empower teams with creative, efficient AI-driven ideation processes.
Transcript
NOTE: This is a raw transcript and contains grammatical errors. The curated transcript will be uploaded soon.
Well, hello, everyone.
Welcome to ConvEx 2024 by VWO.
My name is Vipul, and I’m the senior marketing manager at VWO.
Thousands of brands across the globe use VWO to optimize the customer experience by gathering insights, running experiments, and personalizing their purchase journey.
Day one at ConvEx 2024 starts with an amazing and interactive workshop by none other than Craig Sullivan. So I’ll just quickly invite Craig Sullivan to come up on stage, please.
Could you give us, you know, a quick gist of what this workshop is going to be and what will the takeaways be, Craig?
This workshop is slightly deceptive, because what we’re doing is teaching you how to play with Lego: the fundamental, foundational building blocks of the new processes and workflows that are gonna replace the way we do things now. Right? All the things that you do now and the way that you do them now, it’s a giant house fire. Right?
And it’s happening already. Right? So this is all about having some skin in the game here. And by showing you some of these fundamental building blocks, it means you can then work out, this is how I might use this to solve my problems or get AI to assist with my workflow or help me with experiments, help me come up with ideas, help me solve customer problems, help me write good hypotheses, right, that would stand up to scrutiny, how to come up with better experiment designs.
Right.
So all of this is about not AI replacing people, but making CRO and experimentation teams faster, with better quality ideas.
Mhmm.
So today is about some hands on techniques to show you what these little atomic parts are. Once we teach you how to play with the LEGO bricks, you can go off and make your own fairy castles. And that’s what a lot of our clients end up doing is building tools and workflows for themselves.
So it starts with chat.
Right?
Yes.
So that’s just really getting on the bicycle. Soon you’re flying in airplanes. Right?
This is the Right.
That’s right. Cool. I think that sounds awesome. And I know that the workshop is going to run for almost two hours, so I won’t eat up your time, Craig. You’ve prepared this very diligently.
Awesome.
Good afternoon, everyone, from gray and rainy London. I’m delighted to be the opener for ConvEx 24 with a workshop for you. And the slides are here on this link, and I’m just gonna put them in the chat window, and you can access those immediately because they’re already up there.
You’ll also find in the files tab this book that’s on the cover, and this book covers way more than we’re gonna cover in the workshop today. We’re just giving you a little taster of what’s inside and what we do with people.
So who designed these workshops? Massive props and credit to Iqbal here. I’ve spent thousands of hours testing all the prompts in the book and in these workshops, and I was working all day on this yesterday checking all the stuff.
But he was the one who really came up with a lot of the heavy thinking behind this and has been working directly with product teams. So he’s been rewiring the way that people work, and the two of us have been turning this into real tools and real approaches that people can turn into production systems. So although he’s not here to present with me, he’ll be here for questions. Please save up some hard ones for him. You’ll love that.
So what are you gonna learn today? Some foundational principles and techniques for using LLMs. Right? Chat AIs, right, things like Claude and GPT and Perplexity and so on. We’re gonna show you how to improve outputs using a conversational approach, right, and how to turn customer problems or feedback into really great ideas and A/B tests.
We’re gonna do it together with a series of practical exercises, and you can then steal all this work. That’s what the book is for. I want you to steal the book and steal all these slides and take credit for it. Right?
Because that’s the way that this will get transmitted out into the world. Play with this stuff and then make it your own. That’s why we put it together. So read the book or get a workshop for your company.
Talk to me later.
These are the things that we’re gonna cover today. I have a little introduction, then we’re gonna do some hands on stuff, and then it’s pretty much into questions. So, most of the workshop is gonna be a hands on exercise.
The first bit of work for you is I want you to go to perplexity.ai, which I’ve put in the chat window. Right? And you need to sign up for that. It’s a very quick sign up.
It won’t take you very long. You can use your Google ID to authenticate with that, but you need to be logged into that. Otherwise, you can’t attach a file. Right?
And we’ll be doing that later on. So whilst I’m doing the introduction, you can be sorting out Perplexity.
We’re also using a test dataset. Right? So you’re gonna need this file to download. Right? It’s a little PDF.
And, again, I’ll put the link in the chat, bit.ly slash skill reviews, and that will download the PDF onto your computer. Once you’ve done those two, you’ll be totally ready for the hands on bit, which comes shortly.
So let me cover the introduction.
Is this the advent of artificial general intelligence or a new god? No. I’m not worried about LLMs, not in the way that people think, maybe. I’m worried about the intent, how people use them practically in the real world.
I’m much more worried about autonomous robots. Right? Pieces of software running around with nobody giving any oversight to them. That’s genuinely terrifying to me, but LLMs are not.
It’s a new machine for me. It’s very useful for CRO, UX research, experimentation, all sorts of things. But it’s like most stuff. If you put garbage in or bad thinking in, you will get garbage out, and a lot of people expect it to be magical, and it takes effort.
Right?
One of the things I want you to remember after today is before you type, stop and think about what it is that you want, what’s the goal or outcome, and try and work back from that. Just lean back in your chair and fold your arms and think for two minutes before you start tapping on the keyboard, and you’ll find the result will be better. And once you’ve run your prompt or prompts, then check the outputs. Right? Make sure you can trust this stuff.
And LLMs or chat AIs, imagine they’re a giant lake with all the words from the Internet, all the things that people have said or written poured into them. You can go fishing in this lake and find some wonderful things. But if you fashion your hook wrongly or the data is very limited, you may not hook what you’re looking for in that lake of words. So it’s as much about the hook that you fashion as it is about the information you’re actually fishing for in the lake.
Oh, okay. You can’t open that last item. Let me just check the Bitly link here. Bear with me.
Thank you for telling me.
Skill reviews. I may have missed oh, no. That’s correct.
Let’s just put the link in here.
It’s blocked for your firm. Right?
Some people may have this blocked as a corporate thing. Downloading it, trying to think where else I could put this file.
Ah, yes.
Maybe one of the guys from VWO could put this file in the files tab, please, Vipul?
Let me know.
Yeah. It’s so let me just quickly do that.
Thank you very much.
The skill reviews link. Right? Yep.
Yeah. The skill reviews link. It will download a small PDF to your computer.
Just give me a few seconds.
Yeah. You’re right. It’s a list of complaints.
Just give me a few seconds, guys. Let me just, because we’ll have to upload it.
Yeah. It does lack a lot of data, but it makes connections that humans don’t, or cannot make, or forget to make, Lindsay.
So it’s capable of making more connections than humans normally make when given the same amount of data. And when given considerably more, it’ll make even more connections.
We’ll cover this. Don’t worry.
So in the files tab, you will get that sorted out shortly. So I’ll come back. It should work for most of you, but it’s good to know some people’s laptops are blocking downloads.
So quite right, Lindsay. It’s very good with words and word frameworks. Right? That’s one of the things that we have discovered from a lot of playing over the last year.
But here’s an important thing. When satnav came out, people didn’t think, oh, the car will self drive now. Just put in the destination, and it goes there. It doesn’t work like that.
You’re driving. Your role as a human is to drive the car. You set the destination. You get the passengers in.
You make sure everyone gets there safely and on time. You are the executive function.
The AI is a driver assist here. Right? It’s helping you get to your destination with less hassle, less friction. Right?
It’s not actually doing the driving. You’re not sitting in the back playing Candy Crush while AI drives you around. That’s not here yet. Right?
You can’t assume it’s right. Just like you can’t assume you’ve got the right destination in your satnav. You check to make sure there aren’t two cities with that same name. Right?
So you’re driving, not AI, and you should really treat it like an enthusiastic intern. Right? It comes up with some really brilliant ideas, but also some very naive and stupid ideas as well.
So AI can help you with all of these things, and this list is all stuff that’s covered in the book, but it works at a task level. Some stuff works, some stuff does not, and it’s only by playing with all of these that you’ll actually find out what works.
The file is now available under the files tab, by the way, so you can get the Skillshare reviews, and we’ll come to that in a minute.
So thinking before you type, work back from the outcome. Right? Give the chat interface context and background, upload files, style guidelines, PDFs, text files.
Give it clear instructions. This is really important. You don’t need to give it complex instructions. You just need to give it clear instructions because overly complicated mega prompts are not always necessary.
These make good recipes once you’ve figured out the ideal recipe to cook. Right? But they’re not necessarily the right way to start. The way to think of a lot of the right prompting strategies is as a chain, a conversation, a series of interactions, and this is called chain of thought.
You can look it up. And if you ever get stuck and you think, I don’t know how to ask AI, I don’t know how to write the prompt to get the thing that I want, well, you can ask the LLM, and it will tell you how to prompt it.
And I’ve tested this a lot and it gives really good answers. Right? So you don’t need a manual. Right?
You can just ask it if you get stuck.
And Ethan Mollick’s quote on this is good because he says here, we don’t know if there’s a special skill involved or if it just requires a lot of time spent with chatbots. And he’s absolutely right because this is what we find with working with people. Imagine if the first time that you got on a bicycle aged five, you fell off and decided never to ride it again, or you got on a skateboard, and the first time you tried a trick, you fell off and you decided never to do skateboarding again because it was dangerous. You would never get to the point where you mastered it and saw the point of it, right, if you gave up early.
Right? So it is like riding a bike here. You shouldn’t be scared of using this stuff. You should just get more and more familiar with it.
So even though people may struggle with learning how to ride a bike when they first start out, everyone within a reasonably short period of time manages to get to the stage where they can ride around with their friends. Right? And that’s what I want you to do with AI.
So here are some do’s and don’ts. I’ve left more details in the deck, but you need to check your sources. Right?
Always ask AI for the sources for stuff. It might be a terrible source. You don’t know unless you ask.
You need to prime your chat with data. If you’re asking AI to write content or come up with an answer to something, the more data background documents you can attach, the better it’s going to get. That seems like a really tiny little point, but trust me, it makes a huge difference.
Make it your intern. You’re in charge here, and don’t use it to replace voice of customer or customer research. Right? It’s not a replacement for customers. It can’t synthesize the reality of customers, but you can use it to amp up your research. And you should always turn training off particularly if you’re uploading sensitive data and you can do this at an account level.
So even something as simple as, please explain the theory of social proof to me, becomes a lot more interesting when you know how that result is compiled and what the books and sources are. So I can see that these are credible sources. They may not be right, but always ask for this stuff when it’s a research question like that.
If you want to turn off training on your data, I’ve put all the ways you can do this with GPT into the slide deck. So you can turn off data controls, you can disable your account entirely for OpenAI, you can also use the ChatGPT playground, and this will not use your data for training by default. And the same is true of custom GPTs. You can turn off the setting here.
This little checkbox will disable it.
Some don’ts here. Don’t be scared about this stuff. The more you ride the bike, the easier it will get.
Please don’t ask for math stuff unless your LLM is capable of doing maths and concrete facts. It’s lousy for this stuff. It’s very good with words. Right?
Don’t assume it’s right. This is where a lot of mistakes come from. People think, the output looks amazing. It must be right. You have to check it.
If any of you have tried to upload text or reviews or stuff and get chat interfaces to count them, stop doing it. It doesn’t work. You can’t count stuff reliably. Try running it ten times.
You won’t get the same numbers. Right? And there’s a process to doing it. Iqbal can talk about this later on.
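As a hedged illustration of the counting point: when you need exact numbers from a pile of reviews, a few lines of ordinary code give the same answer on every run, which a chat interface may not. This is a minimal sketch and not the process Iqbal describes. The sample reviews, the keyword list, and the function name `count_keyword_mentions` are all made up for illustration.

```python
from collections import Counter

# Hypothetical sample data; in practice this would be your own review text.
reviews = [
    "I was charged after the free trial and could not cancel.",
    "Cancel button is hidden. Refund refused.",
    "Great classes, but billing support never replied.",
]

# Hypothetical keywords to tally; pick ones that fit your own data.
keywords = ["cancel", "refund", "billing"]


def count_keyword_mentions(texts, words):
    """Count how many reviews mention each keyword (case-insensitive)."""
    counts = Counter()
    for text in texts:
        lowered = text.lower()
        for word in words:
            if word in lowered:
                counts[word] += 1
    return counts


counts = count_keyword_mentions(reviews, keywords)
print(len(reviews), dict(counts))
```

Running this ten times gives identical totals, so the numbers can be trusted, and the LLM can then be used for the part it is good at, which is working with the words.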
You won’t get good market stats unless it’s a huge market.
So, again, check the sources here. You might find it’s two really terrible blog articles that are all the sources for the market stats being presented to you. If it’s a small vertical, there may not be much data on this, you know, on accessories for hamsters in Malta.
Please don’t expect magic.
Trial and error and failure are part of this. And until you have an operationally working system where you validate the outputs, you can’t trust what you’re given. Right?
GPT and other chat interfaces are not a replacement for talking to and listening to customers. They’re a way of helping you achieve that at scale. If you make a recipe or product without any customers in it, that’s exactly what it’s going to taste like.
Yes. It is really good at coding and writing formulas for Excel and writing script code for all sorts of things. Good point.
Get everyone to play with these tools, especially the most skeptical people, because familiarity will show them how they can apply this to problems that they already have. And until you do the playing, you can’t actually see how it’s going to help you. So familiarity is absolutely key, and that’s because understanding the art of the possible means knowing what works and what doesn’t work, and a lot of this is covered in the book. Most gains that my clients are getting are from speeding up stuff they’re already doing. It isn’t through some brand new amazing thing that they’ve discovered the AI can do that wasn’t able to be done before. Most gains are from speeding up really boring or repetitive stuff that we do already.
And teams and companies that are ahead of the curve here, people who have got skin in the game and are playing with this stuff and iterating it and experimenting and failing, will be swifter to apply it as it improves. Right? So you’re either stuck in your local maxima, your foothills of the AI revolution, or you’re actually on this bigger mountain.
You have to be there to actually take advantage of this. And there’s an evolution here. We’ve seen it working with people even over the course of a few months. That evolution starts with people playing around with the chat interfaces, and then they try lots of different tasks.
Right? And some of them work, and some of them work incredibly well. So they then begin to sort of productionize that and build conversational strategies and chains that they use. Oh, I use these prompts to do this thing, like, every day or every week.
Right? And then those get augmented by people uploading documents. So I’ve uploaded our style guideline for this sort of content creation, and then, oh, I’ve created a custom GPT that now has all our corporate documents loaded up so it knows all about x, y, and zed. Right?
And then a further evolution where humans are actually orchestrating a series of agent workflows or individual agents in doing tasks. Right? And, also, AI agentic workflows where AIs are actually autonomously dealing with parts of the entire workflow. And all of this just started with playing around with chat.
Yeah. Sorry to everyone who’s having audio issues.
And this is the way I like to think about it. I’m not a fan of replacement.
Right? The replacement theory of AI. I’m a fan of the augmentation theory. Sigourney Weaver in the film Aliens puts on this exoskeleton.
She’s still the same Sigourney. She’s just faster, better, and stronger, and she thinks exactly the same way. Right? So she is augmented. She’s extended by this technology, and that’s you on AI.
And what happened after our workshops?
Well, all sorts of interesting things happened. One client went off and used AI to generate a whole host of product and lifestyle images, and that worked amazingly well. That’s completely changed the way that they’re building their ecom product.
Another client has integrated the ideation we’re showing you today into Airtable, right, to basically give people more experiment ideas based on the research and data that’s been put into the repository.
Another client has taken open source software at no cost and replaced their very expensive language translation solution with one that does it all completely free. They’re even able to change the tone of voice of this stuff when they do the translation work, so they can experiment with it.
Another client is streaming all of their call center voice to a text repository so they can analyze it using LLMs.
It’s amazing. They can spot trends way faster than their existing software can. So efficiency is an obvious one here, but the surprising one for me, on top of just task time saving is the quality and diversity of ideas. That’s one of the biggest impacts here on people’s creativity and coming up with good quality experiment ideas.
So the result of this stuff is higher quality experiment designs, hypotheses, ideas, understanding of customer problems, and all of this helps to reduce solution bias.
So a little word on prompting mastery.
And these are four key principles that Iqbal has put in here, and they’re really quite vital, and you’ll see them, like a stick of rock, throughout the rest of today.
And the first one is don’t think of one interaction. Think in terms of a conversation and multiple interactions. You’re sitting down with the AI. Oh, I’m kinda dealing with this problem. What do you think? Well, you could do this thing or that thing.
Yeah. Let’s talk about that thing. That sounds like a promising idea. And it’s like a conversation.
It develops. It goes down dead ends, but you’re having a back and forth. Right? It’s not just a one shot affair.
So think in terms of multiple conversational turns. Next, have a process to validate the output. If you’re building a workflow to process something, to analyze something, to create something, you need a system to check and validate the output.
That could be an additional piece of software. That could be some manual checks that you do. It could be something that’s automated, but you need that in case something goes wrong. Right? I know it sounds crazy that we need to build software to check the output of other software.
Welcome to the world of LLMs.
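A rough sketch of what an automated check on LLM output could look like, assuming you have asked the model to reply with a JSON list of findings. The field names (`problem`, `root_cause`, `evidence`), the function `validate_llm_output`, and the sample replies are all hypothetical. The idea is simply that every reply passes through a gate before anyone relies on it, including a check that any quoted evidence really appears in the source data.

```python
import json

# Hypothetical fields we asked the model to return for each finding.
REQUIRED_KEYS = {"problem", "root_cause", "evidence"}


def validate_llm_output(raw_text, source_text):
    """Return a list of issues found in the model's reply; an empty list means it passed."""
    try:
        items = json.loads(raw_text)
    except json.JSONDecodeError:
        return ["reply is not valid JSON"]
    if not isinstance(items, list):
        return ["reply is not a JSON list"]

    errors = []
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            errors.append(f"item {i} is not an object")
            continue
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            errors.append(f"item {i} is missing {sorted(missing)}")
        # Any quoted evidence should actually appear in the source data.
        evidence = item.get("evidence", "")
        if not isinstance(evidence, str) or (evidence and evidence not in source_text):
            errors.append(f"item {i} quotes text not found in the source")
    return errors


source = "I could not cancel my subscription and was charged again."
good_reply = json.dumps(
    [{"problem": "Hard to cancel", "root_cause": "Hidden cancel flow", "evidence": "could not cancel"}]
)
bad_reply = json.dumps([{"problem": "Refunds", "evidence": "never got a refund"}])

print(validate_llm_output(good_reply, source))
print(validate_llm_output(bad_reply, source))
```

A manual spot check does the same job at small scale. The code version only matters once the workflow runs often enough that nobody will read every output.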
You should expect to iterate, to ask questions, and then have to refine them. Don’t expect to get it magically right and impress yourself with how clever you’ve been. You may take several attempts to get it right. It’s just you’ve fashioned the fish hook wrong. Expect to try it several times before you realize that you’re getting what you want to fish for.
And last but not least, interact like a boss. Right? You’re in charge. Stop feeling that you’re subservient or not as smart as AI. You’re way smarter than AI because you’re the one with the executive decision making. You’re the one that knows about the company.
AI should be your servant. Right? You should be instructing it, telling it what to do. Do it now. No. That’s not right. Do it better.
It’ll certainly make you feel better anyway.
But our prompting philosophy here, right, just to cut through all this BS, is there’s all these prompt engineering courses and prompt mastery and, you should use these fifty prompts for ecommerce. Right?
BS. Right? All these books and courses are just BS. You don’t need them. Right? You really do not because it’s all about interaction patterns, conversational strategies that you develop for engaging AI with the problems that you actually have.
Right? The tasks that you’ve got. You don’t need complex prompting structures or mega prompts. Maybe when you’ve built something that’s a recipe you want to reuse on a regular basis, but there’s no need for a lot of this stuff that’s recommended.
And these guys just don’t know what they’re talking about. Right?
It’s not even the right method for engaging with a new problem. That’s best started with a conversation.
You need to break this stuff into chunks first. Right? Break it into steps. Because if you say to AI, I’d like you to process this and put it here and do that and then do that thing and then this thing and then do that thing and then put it in Excel. Right?
And then it gives you the Excel and you think, I wonder if any of that is broken, and if so, which bit of it was broken?
But if you broke it down into six steps and did each one, you’d be able to check the workings were actually happening correctly at each one. It’s no longer a black box. Right? So try not to do mega prompts when you’re first figuring something out.
Break it into steps because it allows you to check it and review it. And this is called chaining versus one shot prompting. Right? And there are different times when each is useful.
Right? But a lot of these courses and people talking about this stuff don’t understand that. It’s more about your thinking. Bring your brain to the table, right, for the typing.
It’s not about complex prompts or any of these mega prompts.
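To make the chaining idea concrete, here is a minimal sketch under stated assumptions. `ask_llm` is a hypothetical stand-in that returns a canned reply so the example runs without any account or API. In practice it would be whatever chat tool you use, or you pasting each prompt in by hand. `run_chain` and the three step prompts are also made up. What matters is that every step is run and checked on its own, so a failure points at one step rather than at a black box.

```python
def ask_llm(prompt):
    """Hypothetical stand-in for a real chat call; returns a canned reply so the sketch runs."""
    return f"[model reply to: {prompt[:40]}]"


def run_chain(steps, check):
    """Run prompts one at a time, feed each reply into the next, and check every step."""
    history = []
    previous = ""
    for number, template in enumerate(steps, start=1):
        prompt = template.format(previous=previous)
        reply = ask_llm(prompt)
        if not check(reply):
            # Stop at the first bad step so you know exactly which bit broke.
            raise ValueError(f"step {number} failed its check: {reply!r}")
        history.append((prompt, reply))
        previous = reply
    return history


# Hypothetical three-step chain; {previous} is replaced by the reply from the step before.
steps = [
    "List the main problems in the attached reviews.",
    "Group these problems into themes: {previous}",
    "Suggest one experiment idea per theme: {previous}",
]

history = run_chain(steps, check=lambda reply: bool(reply.strip()))
for prompt, reply in history:
    print(prompt, "->", reply)
```

The `check` here only rejects empty replies. A real one would be a human reading the intermediate output, or a stricter automated test.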
One warning. Right? This role playing thing where you say you are a UX researcher or be a UX researcher or you are a nuclear scientist. Right?
This is a tonal thing. Right? Don’t use it. If you explain the problem and say, this is the problem.
I need this sort of output. Right? And give it the context and background. You don’t need to ask it to role play.
Right? To put on a persona. Right? You’re totally coloring the output, and that bias may be useful sometimes, but, generally, most of the time, that bias isn’t.
If you ask it to role play, say, write this as a UX researcher, it will go on about things like heuristics and stuff that are great for other UX researchers, but not necessarily for the audience you’re presenting this to. Be careful with tonal stuff. Right? It can bias the output in unintended ways.
And here’s a great example of one shot versus chaining. Right?
A lot of these courses and stuff will say, here’s a big mega prompt. Right? Cut and paste this. You know?
Fifty prompts to cut and paste today. Right? And they’ll say, type all this in. I’m a marketing executive, blah blah blah.
And it’s usually lots of, like, ego driven stuff. We’re great. We do this. Right? And it’s it’s not gonna solve the problem or get you closer to it.
You’re just gonna get a lot of mediocre output. But think about it differently. Let’s just break it down. Let’s brainstorm some ideas around this.
Right? And you have a ten minute conversation, and you kinda notice there’s actually a lot of hotels that are offering kinda wellness retreat packages and yoga and aromatherapy and reflexology. Yeah, maybe that’s an angle for us.
Right? So let’s continue the conversation, but make it about well-being experiences. Right? And then that goes on for another fifteen minutes.
And then you say, we’ve had a good conversation. Give me a list of ten ideas based on that discussion. I guarantee you’ll get a way better result than that one shot. Right?
And that’s because you’ve had a conversation.
Right? You’ve not tried to do it in one go. And this book, this is the UK Kindle store, but this book is available on Amazon in most countries that I’ve checked. Please let me know if it isn’t available in yours, and I’ll encourage the author to make it available. There are about thirty pages in this book that are well worth reading and will teach you about lean prompting. Right? So that’s all you need.
So top tips here. Conversations, not one big mega prompt. Iteration and tuning. Right?
Asking for variations. Don’t ask it for one better button copy. Ask for twenty and throw away the ones you don’t like. Upload documents, right, to prime AI and give additional context.
Always check the outputs and always ask AI during these conversations to reflect on the work that it’s done just to get it to go back and think if it could have done any better or done it differently.
So no big courses needed. No hours and hours of time required. These are the three things that you need for prompting mastery. The playbook, which is in the files tab. There’s a whole section, thirty pages, on prompting in there.
There’s this book, which covers the lean prompting technique that I’ve just shared with you, which is free if you’ve got an Amazon Prime membership. And just start using it for stuff. Those are the three, right? There isn’t any special skill, as Ethan said, needed. Just get stuck in.
Great.
Let’s do some prompts now. Right?
Introduction bit over. Let’s get stuck in. Right? I’ve seen this quote so many times recently. I don’t know whether it’s becoming a meme or something, but I’ve literally seen it in five presentations the last two weeks. But it’s a really good quote because we don’t spend enough time when building products, right, actually thinking about the customer problems and defining them, right, and we should really be spending way more time on that and way less time on coming up with solutions.
And you see, solution bias here is the tendency to favor certain solutions over others. Right? We’re working with a product. We’re working with a team.
We think we know what the problem is. Right? Oh, it’s customers, like, not understanding the cancellation process. Oh, if we do that, like, that should solve it.
Right? That is solution bias. Right? You you’re thinking that you know what’s gonna solve the problem or shift the behavior, or you prefer solutions that are amongst the set that you’ve kind of all discussed internally already.
Right? So, of course, you’re gonna gravitate to this, and, of course, you’re gonna be biased, but this is a huge issue for experimentation and product teams. Right? Because it creates solutions with no connection to the problem whatsoever.
So if you’ve got a plumbing leak in your house, that’s great. I’ll just buy some pigeons so that they can swim on the lake in my kitchen. Right? That’ll solve the problem.
Right? And that’s a lot of experimentation that we see in the backlog. We think, why is this stuff in the backlog? And the reason is is because somewhere along the line, someone has lost the connection to the original freaking problem and the understanding of it.
Right? What you actually need is a plumber, not a pigeon to swim on it.
And this is the whole problem. Right? You may have a process flow like this for experimentation, research, discovery.
Just insert your product or experimentation life cycle here. But at every step of this, if we work back from the experimentation backlogs we look at, we think, why are all these terrible A/B tests in here, just really crazy stuff that hasn’t been thought out?
We see that the whole problem is caused by relevance bleed along the whole chain. It’s like pass the parcel. Right? Someone says, you know, like, who put this experiment together? And they say, don’t look at us. We didn’t design it.
Those guys there gave it to us to build. Right? They threw it over the hedge to us and said, right. You build it now.
And this relevance bleed at every step is causing the final thing that you end up testing or changing being completely disconnected from the problem. At some stage, that linkage is lost. Right? And the connection with the root cause, which is really important, what’s actually driving it, that actually gets lost in the grand scheme of things.
But if we actually use techniques and AI techniques to dig into taxonomies of problems like this, we can actually figure out what their structure is like, and we can also figure out what are the things that drive and combine all of the problems and root causes. So what are the problems and the subproblems and the sub-subproblems that make up the problem taxonomy structure inside, say, a piece of text or customer feedback?
So all the raw data that we analyze, whether it’s from call centers, reviews, customer emails, all of this has the taxonomy of problems or delight in there, right, and subproblems and root causes. No matter how we process the data, there’s always this taxonomy there, and it’s usually one, two, or maybe even three levels deep.
And what we could do here is use AI to help us fill out this taxonomy and by filling out these issues we actually end up understanding a lot more about the root causes, but also how these problems and subproblems connect together. Right? How the whole machinery of the problem actually is structured in here, and this is one of the very interesting things that we can get out of AI.
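One way to picture the taxonomy being described is as a small tree: problems at the top, subproblems beneath them, and root causes attached wherever they are known. The sketch below is only an illustration of that shape. The `ProblemNode` class, the helper functions, and every label in the example are invented, and they are not taken from the workshop dataset or from any output shown in the session.

```python
from dataclasses import dataclass, field


@dataclass
class ProblemNode:
    """One problem in the taxonomy, with optional root causes and subproblems."""
    label: str
    root_causes: list = field(default_factory=list)
    children: list = field(default_factory=list)


def depth(node):
    """Number of levels from this node down to its deepest subproblem."""
    return 1 + max((depth(child) for child in node.children), default=0)


def outline(node, level=0):
    """Render the taxonomy as indented lines, root causes listed under each problem."""
    lines = ["  " * level + node.label]
    for cause in node.root_causes:
        lines.append("  " * (level + 1) + "root cause: " + cause)
    for child in node.children:
        lines.extend(outline(child, level + 1))
    return lines


# Invented example labels, purely to show the structure.
taxonomy = ProblemNode(
    "Billing complaints",
    children=[
        ProblemNode(
            "Unexpected charges",
            children=[
                ProblemNode(
                    "Charged after free trial",
                    root_causes=["Renewal date not shown at sign-up"],
                )
            ],
        ),
        ProblemNode("Refund refused", root_causes=["Policy unclear"]),
    ],
)

print(depth(taxonomy))
print("\n".join(outline(taxonomy)))
```

Holding the structure like this makes it easy to walk back up from any root cause to the problem it sits under, which is the link that tends to get lost between research and the experiment backlog.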
Yeah. Five whys, you’re absolutely right, Varon. It’s a very similar thing. Right? And we can walk up that tree.
Just some admin here at this point. All prompts and instructions that I show on the screen, they’re gonna be on the slide, but I’m also gonna paste them into the chat as I do them.
Right now, if you can, please open up a browser tab for perplexity.ai. And if you haven’t authenticated, if you could do that now, please.
As we do the practical tasks, I’m gonna give you a set amount of time to do them. There’ll be a timer running on the screen. Right? No pressure.
I’ll be quiet during those periods, but if you have any questions in the chat or you run into any problems, please ask me. But I’ll just shut up during those times to let you all work. Right? But there’ll be periods where I’ll ask for some feedback.
So any problems let us know in the chat.
And why Perplexity?
Because you don’t need a paid subscription to upload a file which we’re gonna do.
So please go to perplexity.ai and also download the PDF file from the files tab because you’re going to need that in a minute.
And we start. Problem exploration.
In that dataset that you should have downloaded by now, there are five hundred and sixty five negative reviews of skillshare.com, all the really bad stuff. Right? We’ve taken all the most negative sentiment, the problems, all the hassles, and put them into this one file, and we’re gonna use that for this exercise. But this could be your own data.
It’s just an example. Great. We’re getting some feedback. I’m sorry for those that are stuck, but there will be time for you to catch up here.
So we’re gonna try to identify root causes quickly and efficiently here, right, using this technique.
And the first thing we’re gonna do is I work for Skillshare.
Sorry. Let’s get the prompt up.
Let me paste that. So this is what I want you to put into the Perplexity window, and I’m going to give you one minute to run that.
Awesome.
Download link pending chat. Thank you.
So this is just a very simple prompt to set the context. Right? Just before you start going off half cocked having a chat with AI, you want to make sure that it understands the thing that you’re talking about. This is where clear instructions help, but this also helps to set the context. Right? So if you are gonna have a conversation about some data, you need to make sure that you’re actually talking about the same thing.
So this will have given you a response talking about who Skillshare is. And what you’re going to do in your Perplexity window is paste in this prompt and then attach the file. There’s a little link in Perplexity.
You will see it here, and then you can attach text or PDF files. Right? In this case, you will be attaching the Skillshare PDF file. So I’m going to give you a couple of minutes to do that.
So you should have received a whole list of stuff back from Perplexity there. And what we’re gonna do is we’re just gonna tidy it, so you’re gonna type in this prompt. I’ve just put prompt number three into the chat, present the issues you listed as a table, and that will tidy up that list and make it a little bit neater. I’ll just give you a minute to do that.
Excellent.
You should then have a nice little table like this. Right? And I’m going to give you a minute to just read through it. Have a look at the table. Right? Tell me any thoughts about it.
What do you think? Is it getting to the detail you want? What’s missing? What’s wrong?
Cool. So you should have a table there of this stuff. Right? But there’s one of the issues here, right, that I’ll come to in a minute.
But, essentially, what you’re doing with this is bootstrapping problem exploration by using a raw source of customer feedback. Right? And this could have been call center data; it happens to be some reviews data posted online. But it kinda gets us closer to the finishing line, with less effort to get to our goal of understanding what’s inside that huge text corpus.
Right? Because usually that’s very hard to process. But here’s the problem. Is it accurate? And we can talk a bit about hallucination and QA.
I’ve got a couple of slides about that. But you need to define your validation criteria, right, for checking this stuff. So does anyone have any ideas in the chat? How would I go about validating that the stuff in that table is actually real?
Right? Can anyone tell me?
There is a way of validating this particular example. Ask it to cite its sources. Yes.
So if we said to it and and this is just, like, a bonus prompt. You don’t have to type this in. But if we said, for that thing, right, one of the root causes, provide the quotes that are relevant to this, then you can go and check that those quotes actually exist in the raw data. Right?
Some chat tools that you use will actually paraphrase this, right, including the latest model of GPT. Right? So ways of checking this would be to check if the quote exists in the raw data source. Right?
If it doesn’t, then you can prompt, I don’t see this quote. Have you paraphrased it? Give me the actual quotes from users, or you could do a quick keyword search. Right?
You can do this to prove out that the model is actually working. Right? And we will talk about hallucination later because this is something that you’re guarding against. And this can happen when you’re ironing the kinks out of a system until you’ve got it working happily.
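The quote check described above can also be done outside the chat tool. This is a minimal sketch, assuming you have the raw feedback as plain text and have copied the quotes the AI gave you into a list. The function names `normalize` and `check_quotes` and the sample data are hypothetical. A quote that comes back as missing is a candidate paraphrase or hallucination to challenge in the chat.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, unify curly quotes, and collapse whitespace so that
    minor formatting differences do not cause false misses."""
    text = text.replace("\u2019", "'").replace("\u2018", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()


def check_quotes(quotes: list[str], raw_text: str) -> dict[str, bool]:
    """Map each quote to True if it appears verbatim (after normalizing)
    somewhere in the raw text."""
    haystack = normalize(raw_text)
    return {q: normalize(q) in haystack for q in quotes}


# Hypothetical stand-ins for the review file and the quotes the AI returned.
raw_reviews = """
I tried to cancel my trial but could not find the button anywhere.
They charged me for a full year   without any warning.
"""
ai_quotes = [
    "could not find the button anywhere",                    # verbatim
    "They charged me for a full year without any warning.",  # verbatim apart from spacing
    "Users struggle to locate the cancel option",            # paraphrase, not a real quote
]

results = check_quotes(ai_quotes, raw_reviews)
for quote, found in results.items():
    print("FOUND  " if found else "MISSING", quote)
```

Exact substring matching is deliberately strict: it will flag light paraphrases as missing, which is the behaviour you want when the goal is to confirm that a quote really exists in the source.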
Expect to hit problems and expect to validate and check stuff continually. Most of the work that Iqbal and I have put in over the last year has been testing things thousands and thousands and thousands of times, and them not working. Right? But all of this is in the book. All the good bits are in the book, not all of the failures.
So these raw text inputs are an excellent source for understanding the structures of customer problems or delight. Right? And you can use positive reviews for writing customer content. Right? You ever thought about that? Uploading some five star reviews will help you write some genius content for value propositions.
But we’ve shown you here kinda how to bootstrap problem exploration, but you can use this technique in many different ways.
But one of the important things here is you’re priming AI with good quality data input. You’re not just saying, hey. Let’s talk about cancellation and stuff on this brand Skillshare. Right?
It’s not a high level discussion. You’re saying, here’s a lens through which to actually see the problems that customers are reporting on the site, and that could be multiple data sources. But it also helps bootstrap your own thinking and focus on real user problems, not imaginary ones. Right?
So you must validate the output so you can trust the results, and it’s good to have reflective prompts in here to get AI to check itself. A bit like asking it, are you paraphrasing that quote? Can you go and check it, please?
What if no source is present? Well, you’re just having a generic discussion there, but you could upload sales figures, contextual information, some presentations, some documents, some internal corporate information.
I’m sure there are some additional sources that you can add there.
You’ve got to start with customer problems. Right? If you all sit around in a room thinking, hey, me and my colleagues know exactly how to solve all of these, then you are truly lost. Right?
So breaking this stuff down into smaller issues is our next step, because one of the problems that some of you correctly identified is this stuff is too high level. We only went one level deep on this. We actually need to go even deeper; we only kind of skimmed the surface of this problem taxonomy.
So what you’re going to do now is pick one of the root causes in your table. So scroll back up to your table and find a root cause, one that you are curious and interested about, and break that issue down into sub issues. Right?
You will find the prompt in the chat window. Please paste that prompt in and replace this bit root cause with the one that you like from your table. And I’m going to give you two minutes.
So you should have gotten a problem statement coming back. A couple of you ran into problems.
Yes.
MECE, opportunity solution trees, these all relate to this work. Yes. Absolutely.
Yeah. And we are doing this exercise on Perplexity because it doesn’t require a paid subscription.
There aren’t any other advantages.
These prompts actually work fine on GPT. They came from GPT originally.
So what you’ll get back here is a breakdown of the sub issues. You’re going one level deeper. Right? And there’s a whole process of exploring this taxonomy, and these structures can be quite big and complicated, but you begin to get down to the indivisible problems and how they all relate up that tree.
Getting AI to write problem statements for you is a really smart idea. Right? Please emphasize this totally. Right?
Because it’s the same with writing hypotheses. It’s forcing some critical thinking around the topic. Right? By having problem statements standardized, predictable, easier to browse, look at, have in your backlog.
Right? Everything becomes a lot better from here. So it shakes out gaps in the data research or critical thinking at a really early stage in the process. Right?
Someone like me isn’t worrying about this after the experiment is run. Someone’s been made to think about it when they’re coming up with the solutions to the problems in the first place.
And there’s a bonus prompt here.
You can actually take the sub issue and break it down to further sub issues and ask it to provide a list of questions. Right? So this will explore with you, and think of this as a conversation you can have about additional research that you can do, some data that you might need, some measurement in this area that might be really important, right, to even think about coming up with a potential solution for this in the first place.
So, let me just cover the, highlights of this.
By going into the sub issues with that prompt, we wanted to kinda break the larger issues down into a bit more detail so we got some component parts.
The problem with asking AI to come up with ideas at this very high level is it’s too high level. Right? So it will tend to solutionize, and we need to get more specific on the problems.
If we guide AI with an intermediate step of breaking down these high level issues and defining them more clearly, and we ask how to validate and ask questions about this and research these areas, then we’ll actually come up with some decent problem statements. And if you have a good well worked out problem statement, then it’s gonna be a lot easier to come up with ideas and hypotheses, right, for product changes and experiments.
So AI is helping you here to understand customer problems, break them down, and create a kind of taxonomy or tree of these problems.
So we kind of explored problems in terms of an issue tree or an opportunity solution tree. We also showed you some prompts for getting further research, doing further research or asking more questions to get a better understanding of problems and fleshing those out. And we also created a list of standardized problem statements in a specific format.
So we’re now gonna do the most fun bit of all, which is the ideation part. Let’s get stuck into that.
I would like you to pick one of the problems that we identified going back to you have it in your table. Right?
So one of these issues here, one of the sub issues. Right? I would like you to pick one of those. Okay?
And looking at that sub issue yourself, I would like you to open up a notepad or a Word document. Right? So it might be something to do with cancellation. It might be something to do with customer support, but pick one of the items from your table, and I want you to write down as many experiment ideas or changes to the product as you can that you think will solve this problem or shift the behavior. Right? So three to five ideas, and I’m going to give you five minutes to do this. I will be on chat if you have any problems.
Great. That must have been so super easy.
You must have hundreds of ideas already.
Keep that list because you’re gonna compare it in a minute. Alright?
I don’t know about you, but I find that uniquely painful. Five minutes. Right?
But that’s good. Right? I just want you to feel how hard it is to come up with lots of ideas. Right? It gets harder the more you do it.
So, what we’re gonna do is we’re gonna take one of the problem statements in your table. Right? And we’re gonna ask AI to come up with ideas. So you’re basically gonna take the same thing that you just generated your ideas for, and you’re gonna get AI to generate ideas for the same thing. Right? So I’ve put the prompt in the chat window. I’m going to give you three minutes to run this and have a good look at it.
Right.
There we go.
You can ask for more than these. The number that I asked for back was eight, but you can ask for more if you like. It’s pretty manageable.
How do they compare to your ideas? Right? Are they more closely related to problems? What’s different about them? I’d like you to think about how this would work if you were using it with your team. Would it add extra ideas, bring anything to the table? Would you consider maybe more ideas or different types of ideas?
So, what I would like to do for our next one is show you just a little extension prompt to this. Right? So, hopefully, what you will have discovered, like the other people in our workshops, is that it isn’t that the AI list is better. It’s just that it contains things that you might have forgotten.
You think, oh, yeah. Yeah. That’s a really good idea. It may be something you didn’t think about.
It may be something you should have remembered.
It may be something absolutely obvious, but everybody’s missed it. Right? The point is that your list combined with the AI list is better than either list on its own. Right?
And that’s the point. But now what we’re gonna do is take the guardrails off a little bit and get AI to go slightly outside the boundary of what it would normally suggest to you. We want disruptive or crazy ideas. So what I want you to do is put in this additional prompt.
Right?
And say, give me eight disruptive ideas to solve this problem and think outside the box. Right? And please, share any that you find humorous. I’m going to give you three minutes to play with this prompt. You’ll find it in the chat.
Yes. If you see things involving geostationary satellites, blockchain, AI powered chatbot for subscription queries.
Yeah. We’ll pass it to the chatbot.
No one will ever cancel because they won’t be able to figure out how to cancel with the chatbot.
I love that one.
Social renewals. Right? So, like, a user could opt to have their friends remind them about upcoming renewals, leveraging social pressure to increase awareness.
That’s hilarious.
Refund crowdfunding.
Get people to have a whip-round to pay for your subscription.
Yeah. I like the subscription swap marketplace. I saw that one yesterday.
Owen, there’s got to be crypto in there some way, some crypto stuff.
Now there was one very interesting idea someone pasted in there already.
That was that was funny, but then it wasn’t funny. It was interesting.
Some of these are so funny and crazy. They’re just no. It’s never gonna happen.
Some of them may trigger another thought in you.
Subscription insurance is an interesting one. What about renting an apartment insurance so you don’t have to pay a deposit?
That’s a good one.
Right. So you pay escrow, so you don’t have to pay three thousand euros for your deposit for your new apartment. You just pay a small insurance fee every month. That’s an example of an idea that came out of one of these stupid ideas.
So you should all have seen some pretty crazy stuff in there. We’re asking AI to definitely step outside the bounds of where it would normally go. Right?
These are really funny. Right? But if you actually look at some of these ideas, they have a grain of another idea inside. Right?
So one of the ones that someone posted was if Skillshare integrated with smart home devices. Right? But when you read that, you think, no. That’s not really relevant.
But then you start thinking, well, maybe if we allowed people to do some of the training on those devices by listening to it, maybe we should be integrating and partnering with those people. So that’s one idea. Another one that came up in one of our workshops was for an events company, and they do events, like, all around the world, like, millions of events every day.
And one of the ideas that came up was for a TV channel, an events-based TV channel. And everyone was, like, laughing. That’s really stupid. Who’s gonna watch a channel about freaking events?
That’s gonna be really boring. But the CMO said, no. Actually, that’s a really good idea. Right?
Because we could hire a videographer. We could set up a YouTube channel, and then we could create a whole series on events that are happening over the course of the next month and send them round to all these events to showcase what we are actually doing as a platform. Right? We are making these events happen, and we are providing the glue and the infrastructure.
So that was a fifteen minute conversation about how to do that. So with the crazy ideas that you’re seeing there, a lot of them will be really crazy and not doable, but some of them may have the germ of another idea that will start a conversation.
And here are some bonus prompts for you to play with later. Right? If you like one of those ideas, a bit like maybe the events idea or a bit like the one I pasted in there about smart home devices, ask AI to analyze that further, right, and reveal the key themes at play, and then create ideas based on that theme that’s actually driving that idea that you really like. Right?
And this ideation stuff could be really helpful to teams. Right? By doing this kind of asking for disruptive ideas, crazy ideas, you could say I want sustainable ideas. Right?
You’re asking it to do something slightly different with the ideation, either to filter it or to totally change it or to widen the diversity of ideas that it gives you. Right? And that diversification usually ends up pushing some really new and innovative ideas into the stack. So if you’ve got your list, you’ve got the AI list, you’ve got the crazy ideas list, all three of those together are a really good list of ideas for you to start with.
So AI with your guidance is very creative. Right? There are scientific studies now showing how creative it actually gets in terms of making up connections and coming up with ideas.
You will, through a process like this, find greater diversity in your own ideas and those that your colleagues can come up with, and you will also end up including ideas that are not just coming from your solution bias or your familiarity with the product. There’s more objectivity here. The other thing is you’ll include ideas and discussion that most humans would have been too embarrassed to bring to a meeting. No one would have ever sort of put their hand up and come up with that events channel thing because they would have been too embarrassed. Right? It’s ego.
And it will also help include ideas that you probably all forgot, right, that you should have known about. So you are allowing AI to go really wide here, like, go loose and wide, and then you’re doing the filtering and constraining to come up with a bigger list. And this whole process is a great technique to expand ideation, but it also connects these ideas back to the original problem through the problem statement, right, and the research that you did to arrive there.
So I would just like five minutes to get your feedback on these three questions.
And please tell me in the chat about these. I’m just going to run a five minute timer here and answer some of your other questions, but please give me feedback on one, two, and three in the chat.
I’m going to enjoy reading through all of those later. I saw quite a few of them as they passed.
Chris Gibbons has actually spoken about this. Why are we prioritizing experiments? We should be prioritizing problems, right, as defined by problem statements.
And this is a good way of thinking about it. Right? There are opportunities in doing this work, and you can size them. Right?
So there’s no reason why you can’t be prioritizing around problems. Right? Imagine that. You know?
What a change.
To finish off this session, there are two custom GPTs that you can use. Right? If you wanna have a conversation all about your hypothesis, you want help writing it, you wanna chat about a change that you’re making in a test and just talk through what you’re doing, get help tightening it up, or turn any other questions into a hypothesis, then please use this helper, and I will put the link in the chat right now.
There we go. There’s the hypothesis helper.
And the second custom GPT I’ve got for you is the hypothesis checker. And, sorry. Let me grab the link URL and put that in the chat. And this actually uses Stefan Thomke’s hypothesis evaluation framework.
Right? See if you took your experiment backlog and put it through this. Right? It will probably tell you that lots of them need to go and have a chat with GPT.
Right? So put people’s hypotheses through this. Right? And if they need to, they can then go and have a chat about their poorly written hypothesis with that.
Right? So this is all you need. And the thing is, this helps people who are less skilled, less experienced, who are just starting out this kind of work to put together better hypotheses and experiment designs. It will even have chats with you about the metrics that you’re thinking of.
So this is smart work. Right? You’ll get better quality experiments by using these two.
And all of this today, right, the future is already here. It’s just really badly distributed right now, but it is happening. And some of the stuff that we’ve shown you today are just the tiny little Lego building blocks.
So, please give us feedback in the chat on the workshop. I’m also going to invite Iqbal up on stage, because he’s backstage right now, for any questions that you might have.
I can start answering some of the questions while I wait for him.
Yes, please.
Why Perplexity rather than ChatGPT or Gemini? Because you don’t have to pay to attach files. Right? I knew there’d be a lot of people here for the workshop who may not have a GPT subscription, which you need to upload files. There are different rules for different tools, but that’s the main reason.
Yeah. Chaining instructions sometimes leads to weird responses from LLM bots. I can talk a bit about that, but, yes, chaotic responses and hallucinations, it’s multiple things that cause these. My analogy for this is that you’ve fashioned the fish hook wrong. Right?
Or there’s no data in the lake. Right? So hallucination is trying to please you. It’s where AI is trying to please you with an answer where there isn’t one in the lake, or your query is wrong. Right?
So most of these kinks you should try to sort out in a workflow before you turn it into a production system, but that production system will need live checks and reflection. Right? This idea of getting AI to recursively reflect on its own outputs is actually helping the training models to improve.
I see the prompt dates.
How do they manage to do that if they can’t count? Well, it does it by roughly knowing the sizes of the piles. Right? So AI here is it’s a bit like card sorting. It’s really good at putting similar stuff into piles and knowing that some piles are sort of vaguely bigger than the other piles.
But it won’t be able to tell you what nodes in that taxonomy only have one customer quote in there. Right? For the counting, right, there are two ways of doing it. You can either go into the feedback and ask it for all the customer quotes in that node, in which case if it only comes back with one, then you know there’s only one item in there.
That’s the way of sneakily getting AI to tell you how many are in that node. But if you actually ask it to count the items in that node, it can’t do it. Right? For this, you need an SQL database and a vector database and text processing.
Iqbal has actually built a tool to do this. It will take the raw text data and then allow you to browse and tune that taxonomy that’s inside there.
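To make the counting point concrete, here is a minimal sketch of the SQL half of that idea, using Python’s built-in `sqlite3` module. It is not the tool mentioned above. It assumes each review has already been assigned to a taxonomy node by some separate labelling step, which this sketch does not cover, and the table name, column names, and sample rows are all hypothetical. Once the labels sit in a table, the exact counts that a chat model cannot give you become a plain `GROUP BY` query.

```python
import sqlite3

# Hypothetical (review_id, node_label) pairs; in practice the label would come
# from a separate classification step that this sketch does not cover.
labelled = [
    (1, "Cancellation is hard"),
    (2, "Unexpected renewal charge"),
    (3, "Cancellation is hard"),
    (4, "Slow customer support"),
    (5, "Cancellation is hard"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE review_labels (review_id INTEGER, node TEXT)")
conn.executemany("INSERT INTO review_labels VALUES (?, ?)", labelled)

# Exact counts per node, largest first; ties broken alphabetically.
counts = conn.execute(
    "SELECT node, COUNT(*) AS n FROM review_labels "
    "GROUP BY node ORDER BY n DESC, node ASC"
).fetchall()

for node, n in counts:
    print(f"{n:3d}  {node}")
conn.close()
```

Nodes that come back with a count of one are the single-quote nodes that the chat model could not reliably point out.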
There are a few questions on the chat panel as well. Yeah.
I’m looking through the Q&A right now.
If you use a similar prompt technique, yeah. These prompting techniques came from GPT, and I did very minor tweaking to get them to work on Perplexity.
You may tune them and find that you can extend them and make them work better for you in specific contexts. So it’s just a starting point.
But, yeah, this stuff will work on other engines. In the slides, although it was turned off from presenting today, there’s actually a list of all the major tools and what they are particularly notable for, what they’re good for.
As the AI is unable to count the issues, how might we prioritize the ones found to make sure we are spending our time on the right things?
For that, you actually need a multistep processing system to actually count the stuff in the text. Right?
That’s the only way to prioritize this.
You can get an idea by probing it about the number of quotes that you can get back from these nodes. But if you want to do this over a large dataset, there’s only one way to do it, and that’s through using a tool.
Can you go back in your chat history?
Yes. You can. Can you go back in your chat history to start asking from a specific way earlier point? This is a really good point.
You can do this in GPT and some other tools. So if you get a bit lost and you think I’ve gone down a rabbit hole here, and I really wanna go back to where we were, like, halfway in this conversation and go to a different direction, you can go and restart the conversation from that point. Right? And it will just reset it.
Right? So you can carry on from any given point. I don’t know about the other tools, but you can certainly do this in GPT.
How can we apply this prompting to generate A/B test ideas? Well, if you feed it all your customer problems and heartache and gripes and unhappy emails and negative reviews and all the bad things that you can find about your customers and then process it, and then pull out the problem taxonomy, get some idea of how many people those problems are impacting and kind of prioritize them, you will have tons of ideas for an A/B test. Because usually everyone’s thinking, we need to increase our conversion rate. What can we test to increase our conversion rate?
Right? But that’s not connected to a user problem. That’s connected to your problem, right, which is getting the conversion rate higher. But you’re not gonna get the conversion rate higher until you actually solve user problems.
Right? And that’s why this process gets you better ideas than asking your colleagues for their random ideas on how to solve the problem. So, yes, you can apply this to come up with much better ideas for A/B tests, and you can actually integrate it into your ideation tools or your experimentation backlog tools.
Is there a recommendation as to oh, sorry. I’ll answer this one live. Is there a recommendation as to how complex the uploaded files may be or how far they should be broken down?
It helps if you don’t upload all of the same sentiment. Right? So actually picking really happy stuff or really unhappy stuff, so maybe one or two star reviews versus four or five. So happy reviews are used for content generation.
Like, if you want to write some great landing page for, say, a Disney experience, all you do is you upload twenty thousand reviews of people talking about that Disney experience. Right?
And then ask it to write the content for a landing page, and it will use all the customer language and value propositions to write it. Similarly, with the negative stuff, you can actually turn that into something that allows you to create opportunities and experiments that are going to solve real customer problems.
GA4 replacement for CRO analysis. Not quite sure what that one is. If you could give me a bit more detail.
Can AI be biased towards your existing beliefs? Yeah. Of course, AI has bias in it because it’s got all the bias of all the human stuff that it sucked in.
But there are three scientific studies showing now that AI generates a greater diversity of ideas. Right? That it has a bigger selection, a wider range of ideas than even large groups of humans. Right?
If I’d given you an hour earlier on to come up with more test ideas, right, I bet you would have hardly added any in that last hour. Right? It gets harder and harder to come up with new ideas for that thing the longer time goes on. So you can get some really good results here from filtering that stuff specifically.
What other data could we feed the AI to help with problem identification and idea generation? Any background, any contextual background, feedback documents, data as well, GA data, all sorts of things.
Yep. Iqbal. So Iqbal is there, so I’ll just welcome him onto the stage now.
Hey. You made it. How’s it going?
Everyone say hi to Iqbal.
And say thank you to Iqbal as well.
Any questions? Shall I get you to the answer to the one about VWO tracking errors? No.
Any questions that you would like to tackle there, Iqbal?
Okay. Cool. Cool. Cool. Perfect.
Yeah. Got some questions. There were quite a few people asking about prioritization.
And just to kind of cover that aspect of it, like, there’s some limitations of this approach. So just to cover the limitations, AI is currently not good at prioritizing, scoring, quantifying the number of times people have mentioned x, you know, problem or whatever.
So there are limitations of this approach, but this is good as a general sort of beginning step of ideation. And, also, it does have a limit in terms of the number of records from the CSV that it can search. So just bear those things in mind, I’d say.
Yeah. There was a good question earlier on; perhaps you could tackle that. How do we control the chaotic outputs and hallucination that comes with this?
So, basically, there’s a validation step that you do need to take. So before you even engage with AI, you need to understand how am I gonna validate whatever output it’s gonna give me. And I think the example that we went through is that you ask it to give you some quotes.
We didn’t go through the full validation step, by the way, which might be why this question has come up. So you get some raw quotes, and then you basically cross-check them. Now in terms of how do you control the wild hallucination, it’s kind of like riding a wild bull. Right?
So basically, you have to kind of get it back to the point: try to get it to reference specific quotes, try and catch it when it’s not being accurate, and be specific about the areas in which it’s not being accurate. So is it summarizing text instead of giving you actual quotes? Well, pull it up on that. And as soon as you’ve pulled it up on some of the things that it’s doing, it should start to behave.
At some point, it might go a little bit too crazy, at which point you might have to start a new chat window session or go back up the history and edit from a previous point and then just go from there.
Sometimes clients come up with solutions generated by AI.
How to handle such situations?
How can we do stuff when clients come up with ideas that they wanna turn into tests? Well, I would probably encourage them to actually look at the method for coming up with ideas we’ve shown here, because then it will actually relate back to a problem, or you should teach them this technique, or get them to think more critically about the ideas.
Yeah. It’s also worth adding that this is a very quick sort of problem discovery, problem exploration.
What you should get as a result of this is a bunch of questions to dig into further and do some more research on.
So when clients do come up to you and do this kind of crazy ideation on the side with AI, you do have to bring it back to user research.
So using this technique will give you some more ideas about questions and further user research you might need to do. Go away and do that research, whether it’s through surveys or whatever. And then, you know, get to learn a bit more about the problem. Somebody mentioned that this is about MECE, M-E-C-E: mutually exclusive, collectively exhaustive.
And yeah, that’s bang on the money. That’s essentially what we’re doing. We’re creating an issue tree.
And when we exhaust a node of a particular issue, then we’ve exhausted that sort of area, that node of information.
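One way to sanity-check an issue tree against the MECE idea is sketched below. It is an illustration only: it assumes each leaf of the tree has been given a set of feedback record IDs, and the category names and IDs are hypothetical. A record assigned to more than one leaf breaks "mutually exclusive"; a record assigned to none breaks "collectively exhaustive".

```python
def mece_report(assignments, all_ids):
    """Check a leaf-category assignment against the MECE idea.

    assignments: dict mapping category name -> set of feedback record IDs.
    all_ids: every record ID that should be covered.
    Returns (overlaps, uncovered):
      overlaps  - dict of record ID -> list of categories claiming it
                  (violates "mutually exclusive")
      uncovered - set of record IDs no category claims
                  (violates "collectively exhaustive")
    """
    first_seen = {}
    overlaps = {}
    for category, ids in assignments.items():
        for record_id in ids:
            if record_id in first_seen:
                overlaps.setdefault(record_id, [first_seen[record_id]]).append(category)
            else:
                first_seen[record_id] = category
    uncovered = set(all_ids) - set(first_seen)
    return overlaps, uncovered


# Hypothetical leaf nodes of an issue tree and the records filed under each.
tree_leaves = {
    "pricing": {1, 2},
    "delivery": {2, 3},
    "checkout": {4},
}
overlaps, uncovered = mece_report(tree_leaves, all_ids=range(1, 6))
print("claimed by more than one leaf:", overlaps)
print("not covered by any leaf:", uncovered)
```

In practice a piece of feedback can legitimately mention two problems, so an overlap is a prompt to look again at the node definitions rather than proof of an error.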
I think we’ve covered most of the questions here, actually, Vipul. Has anyone got any last questions that you want to post to us? I think we’ve got a couple of minutes.
Yes. So we do have, like, five or six minutes. Let me just quickly skim through the chat box.
I did see a few questions coming in. I’ll just look at the screen one second.
About consistency of responses, Ikbal?
What was that about consistency of responses?
Do you have any tips on managing consistency of responses when you create a workflow that uses AI?
This is a tough one. It’s always gonna be inconsistent.
And it’s very dependent on the dataset. Some datasets are more inconsistent than others because the language isn’t very specific. If the language isn’t specific, or if it could be taken in slightly different ways, then AI will interpret it in different ways.
In terms of trying to get consistency, you have to just do it again and again and again.
And then take the average, in terms of what does it say most of the time? It’s not perfect. Like I say, going back to the original thing that I said at the outset: there are limitations with this approach, and you have to really understand those limitations. And it’s not going to be perfect by any means because, you know, ChatGPT and the way that it analyzes text is gonna go down some rabbit holes.
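The "do it again and again, then see what it says most of the time" idea amounts to a majority vote over repeated runs. The sketch below is a hedged illustration, not the speaker’s tool: `repeat_and_vote` is a hypothetical helper, and `fake_classify` is a stand-in for whatever call actually sends the prompt to a model, returning canned labels so the example runs on its own.

```python
from collections import Counter
from itertools import cycle


def repeat_and_vote(classify, text, runs=5):
    """Run the same classification several times and keep the most common
    answer, plus the share of runs that agreed with it."""
    labels = [classify(text) for _ in range(runs)]
    label, count = Counter(labels).most_common(1)[0]
    return label, count / runs


# Stand-in for a real model call (assumption): cycles through canned
# answers to imitate run-to-run inconsistency.
_canned = cycle(["pricing", "pricing", "delivery", "pricing", "checkout"])


def fake_classify(text):
    return next(_canned)


label, agreement = repeat_and_vote(fake_classify, "Delivery cost was a surprise", runs=5)
print(label, agreement)
```

The agreement share is worth keeping alongside the label: a record where only a bare majority of runs agree is a good candidate for manual review, which fits the point that this approach has limitations you need to understand.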
Stuff changes on us. Right?
So, the latest GPT model started paraphrasing some of the customer feedback, which you mentioned in the talk. Your prompt that works now may stop working when a new model comes out. So this is something you have to watch out for.
And one of the best ways to solve all these kinks is not to say, “Hey, let’s build a production-ready system.” No. Let’s iron out all the kinks and set up a process. Let’s do it in chat first, right, and work out all the kinks and get the prompts right and everything else. Once we’ve run it and got it consistent and reliable with a particular model, and once we’ve ironed out the bugs in it and added the necessary validation steps, then we can start moving it towards production.
And one last point on the consistency thing: this is the reason why I had to build my own tool. There is an answer to getting it more consistent, but you’re not gonna like that answer, which is a really, really complicated, really long, drawn-out process.
So yeah. Yeah.
Talk to us.
Message us on LinkedIn about that one. Yeah. There’s another question here: can AI draw insights or conclusions if you give it info from an experiment? And the answer is yes. Right?
Two of my clients are using Gemini to actually report on tests, and they’re telling me that about sixty odd percent of their tests can now just be signed off by the analysts who were previously sort of having to review them and call them. Right? So that’s taken a lot of boring work away from them.
What I would like to do, and I’ve approached some A/B test vendors like VWO to help as well, is to have them give me loads of data from experiments that have been declared properly. Right? And then I’m gonna run them through all the AI tools and compare their accuracy scores with where humans have declared the experiments. I’d like to see which ones work better.
The thing I’d add to this is that this falls into the trap of mistaking automation for AI. So, basically, this is a task for automation.
So there are tools out there that can already do this, because, basically, in terms of gathering insights and deciding whether or not a test is a win or a loss, that is a formula. It’s a statistics formula. It’s a very specific statistics formula. It’s very easy to do. You just need to put it into a calculator. So from that perspective, it’s an automation problem, not an AI problem.
AI might be able to do it, but, basically, you know, behind the scenes, it’s probably just gonna end up using a calculator to do the calculations.
So you might as well do that yourself or automate that process.
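To make the "it’s a formula" point concrete, here is a small sketch of one such calculation. The talk does not name the formula, so this assumes a two-sided two-proportion z-test on conversion counts, which is one common choice; real testing tools may use other methods (Bayesian or sequential ones, for example). The visitor and conversion numbers are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between
    control (A) and variant (B). Returns (z, p_value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; no variation in outcomes")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: 10,000 visitors per arm.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=250, n_b=10_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

Because this is plain arithmetic, it gives the same answer every time for the same inputs, which is the argument for treating the win or loss call as automation rather than asking a language model to do it.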
In answer to Val, John, in the chat: use the playground, the ChatGPT playground, and you can play with parameters there.
That was a question just asking how you can play around with fine-tuning some of the parameters. That’s very geeky, and you can have fun in the playground.
Play with the temperature, I think.
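Temperature is one of the sampling parameters a model playground typically exposes. The sketch below is a toy illustration of what that parameter does, not a call to any real API: it applies a temperature-scaled softmax to made-up scores for three candidate next words. Lower temperature concentrates probability on the top choice, which tends to give more repeatable output; higher temperature spreads it out.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate words (illustrative only).
logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)
high = softmax_with_temperature(logits, 2.0)
print("temperature 0.2:", [round(p, 3) for p in low])
print("temperature 2.0:", [round(p, 3) for p in high])
```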
Yep.
Craig, it was a wonderful session. Thank you so much for building this workshop. I really loved the responses from just everyone on this call. So thank you so much. We have already, yep, reached the two-hour mark. So it was quite a comprehensive workshop, I must say.
Thank you so much. Thank you so much, Craig, and thank you so much, Ikbal, for joining us, joining Convex two thousand and twenty four. It was really lovely having you both.
Thank you, sir.