Prefixbox AI Agent
Prefixbox is a Hungarian e-commerce scaleup. We helped them design an AI agent that uses generative AI to give shoppers more accurate product recommendations. Throughout the agile design process, the team devised solutions based on field studies, competitor analysis, and market research to focus and iterate on the most essential features in a rapidly evolving market segment.
The Case
Prefixbox’s product vision was to manage and enhance the shopping experience with better recommendations, lead generation, and customer support. The aim was to answer common shopper questions like:
- Can I buy this product in the nearest shop to me?
- If I order this today, will it arrive by Friday?
- Give me gift ideas for a 4-year-old girl’s birthday
- Payment does not work. Could you assist?
with dramatically increased certainty, accuracy, engagement, and with minimal or no help from the support team.
The minimum viable AI agent solution we worked on aimed to recommend and suggest retail goods more accurately and naturally based on personal needs and preferences.
The challenge was to design the agent in an agile startup setting with few competitors to learn from. When we started working on this project, very few products used generative AI as their underlying technology. Building a white-label product from scratch with generative AI, and finding use cases for it that delivered both user and business value, was a substantial challenge.
We carried out shopping field studies to better understand retail customer journeys. We also analyzed direct and indirect competitors for potential market opportunities and best practices. We created a business model canvas and categorized features based on their impact and effort to have a clear vision for the minimum viable product.
Based on these, we developed a chatbot persona and designed conversation flows and prototypes to get user feedback as soon as possible.
Discovery & Empathy Phase
We started the project by going to retail stores to understand customer behaviors. Instead of conducting in-depth interviews, we replicated typical shopping scenarios to empathize and engage with our target audience.
We bought a wide range of consumer packaged goods and took notes on the shopping experience and on shoppers’ conversations with shop assistants, customer support representatives, and cashiers. We asked tough questions, created scenarios where we considered multiple products, bought FMCG, CPG, and consumer electronics products, and went through all the touchpoints within the store.
Through this process, we collected unique examples and opportunities to enhance our agent and to make it feel more human.
At the time of the project, there were hardly any generative AI agents recommending products based on user needs. We struggled to find direct and indirect competitors, which made testing and benchmarking challenging.
Competitor analysis helped us collect market opportunities, risks, and conversation design best practices. This part of the design process was full of valuable insights.
Best practices for generative AI in commercial products were scarce. However, research showed us that AI struggled to generate recommendations across a wide range of different kinds of products. It worked best when the product offering was specific and limited. This way, the data was more accurate, and we could make the AI more reliable with the retrieval-augmented generation technique.
Generative AI product recommendations worked best with just a few specific products.
Another key insight was that generative AI excelled at product recommendations, but it was not the right choice for the entire conversation flow. Products worked best when they mixed generative AI and conventional rule-based techniques.
Along with usability issues, using generative AI for the entire conversation would have been an example of forced innovation. This seemed overly risky in an uncertain environment. We aimed for a more transitional chatbot experience and used generative AI only for the parts where it seemed to benefit user needs.
We aimed for a more transitional chat experience, and used generative AI only when it benefited the users.
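A minimal sketch of how such a split between rule-based handling and generative AI could look. This is an illustration under stated assumptions, not the routing used in the project: the intent names, patterns, canned replies, and the `generate` callable are all hypothetical stand-ins.

```python
import re
from typing import Callable, Optional

# Hypothetical rule table: intent name -> pattern that triggers it.
RULES = {
    "cart": re.compile(r"\b(cart|basket|checkout)\b", re.IGNORECASE),
    "delivery": re.compile(r"\b(deliver\w*|arrive\w*|shipping)\b", re.IGNORECASE),
    "recommend": re.compile(r"\b(recommend\w*|suggest\w*|gift|ideas?)\b", re.IGNORECASE),
}

# Hypothetical built-in messages for intents handled without generative AI.
CANNED = {
    "cart": "Here is your shopping cart.",
    "delivery": "Let me check the delivery options for you.",
}

FALLBACK = "Sorry, I did not get that. You can ask about products, delivery, or your cart."


def detect_intent(message: str) -> Optional[str]:
    """Return the first intent whose pattern matches, or None."""
    for intent, pattern in RULES.items():
        if pattern.search(message):
            return intent
    return None


def route(message: str, generate: Callable[[str], str]) -> str:
    """Use the generative component only for recommendations."""
    intent = detect_intent(message)
    if intent == "recommend":
        return generate(message)  # assumed wrapper around a language model
    if intent in CANNED:
        return CANNED[intent]
    return FALLBACK  # error path that steers the conversation back on track
```

Passing the model in as a plain callable keeps the rule-based paths testable without any generative component, which is one way to limit the risk described above.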
Defining Features
We used an impact-effort matrix to focus on features that had the highest user value, stood out most against our competitors, and were still viable to develop in a short period so we could go to market as soon as possible.
This process helped the team focus our efforts on just a few key features. It also helped with initial estimations for design and development. We focused on product search, discovery, and recommendations with a well-established shopping cart feature. We cut features like saving products for later, favoriting them, and functionality for returning users. We also cut customer support from the initial scope.
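The quadrant logic of an impact-effort matrix can be sketched in a few lines. The 1 to 5 scale, the threshold, the quadrant labels, and the scores in the usage example are assumptions made up for illustration; they are not the scores from the project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str
    impact: int  # assumed scale: 1 (low) to 5 (high)
    effort: int  # assumed scale: 1 (low) to 5 (high)


def quadrant(feature: Feature, threshold: int = 3) -> str:
    """Place a feature in one of the four impact-effort quadrants."""
    high_impact = feature.impact >= threshold
    high_effort = feature.effort >= threshold
    if high_impact and not high_effort:
        return "quick win"
    if high_impact and high_effort:
        return "major project"
    if not high_impact and not high_effort:
        return "fill-in"
    return "time sink"


# Illustrative scores only.
backlog = [
    Feature("product search", impact=5, effort=2),
    Feature("product recommendations", impact=5, effort=4),
    Feature("save for later", impact=2, effort=2),
    Feature("customer support", impact=2, effort=5),
]

for feature in backlog:
    print(f"{feature.name}: {quadrant(feature)}")
```

A short list like this is mostly useful as a shared artifact: the scores are debatable, but the quadrants make the trade-offs explicit when deciding what stays in the first release.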
Conversation Design
Before designing conversation flows, we needed an agent persona to write the messages in a proper, relatable, and consistent tone.
This stirred long discussions because we were unsure which solution was better: designing a generic persona each business could use or providing settings for custom personas. A generic persona seemed like the safer bet, but providing custom chatbot personas would surely be more engaging and scalable later.
After much consideration, we decided on a generic chatbot personality for the MVP. However, generic did not mean that it did not need any work. A generic personality is still a personality that we had to design and document. We also wrote example sentences that developers could use as a benchmark for our generated and built-in messages.
A generic agent persona is still a persona that we had to design.
We designed a friendly, empathetic persona guide for the system prompts to future-proof the product and to address potential exploits and harassment that are all too common when people start using chatbots.
With the chatbot persona in place, we started sketching conversation flows: the low-fidelity wireframes of conversation design.
Because we wanted to have rule-based conversations and intent recognition for the most part, it made sense to sketch these out first. We designed example scenarios for product recommendations with generative AI and simulated these during prototyping.
We rapidly iterated on conversation flows with developers and stakeholders to have a shared vision and understanding of user needs, conversational design best practices, and development requirements and estimation. This reduced development costs, uncertainty, and risk.
We designed happy flows for product search, recommendations, filtering, and listing with a simple shopping cart feature. Along with happy flows, we covered as many error scenarios as possible. Errors will occur during conversations, but if a chatbot can get things back on track, it can make a profound difference in user experience and drop-off rates. Covering these was a must, even for our minimum viable product.
The Challenges of Generated Content
Based on the conversation flows, we quickly started prototyping in Voiceflow, a tool for designing and building chatbots.
This helped us showcase the product recommendation conversation flows. Voiceflow’s advanced generative AI features allowed us to create a prototype similar to the end product vision. This allowed for quick testing and iteration for this feature, which we could not test or replicate with simple conversation flows.
Even with a limited number of example products, we could set up a retrieval-augmented generation (RAG) workflow within Voiceflow. Compared to using a large language model as is, this technique helped us decrease and almost eliminate hallucinations and inconsistency. We used additional product data to make the recommendations more accurate and better tailored.
Retrieval-augmented generation eliminated many of the uncertainties and pitfalls of the product recommendations compared to using generative AI as is.
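To illustrate the general idea, here is a minimal sketch of retrieval-augmented generation over a tiny product catalog. It is not how Voiceflow implements the technique: the catalog entries, the keyword-overlap retrieval, and the prompt wording are all hypothetical, and the call to a language model is left out entirely.

```python
import re

# Hypothetical example products; a real catalog would come from product data.
CATALOG = [
    {"name": "Wooden train set", "description": "Toy train with tracks for kids aged 3 to 6"},
    {"name": "Espresso machine", "description": "Compact coffee maker for home kitchens"},
    {"name": "Picture book bundle", "description": "Illustrated story books for kids aged 4 to 7"},
]

STOPWORDS = {"a", "an", "for", "the", "to", "with", "of"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(query, catalog, top_k=2):
    """Rank products by keyword overlap with the query (a stand-in for real search)."""
    query_tokens = tokenize(query)
    scored = []
    for product in catalog:
        overlap = len(query_tokens & tokenize(product["name"] + " " + product["description"]))
        if overlap > 0:
            scored.append((overlap, product))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [product for _, product in scored[:top_k]]


def build_prompt(query, products):
    """Ground the model in retrieved product data instead of its own memory."""
    lines = [f"- {p['name']}: {p['description']}" for p in products]
    return (
        "Recommend only from the products listed below.\n"
        + "\n".join(lines)
        + f"\nShopper request: {query}"
    )


query = "gift for kids aged 4"
prompt = build_prompt(query, retrieve(query, CATALOG))
print(prompt)
```

The prompt that reaches the model contains only products that exist in the catalog, which is the mechanism that reduces invented recommendations. A production setup would typically replace the keyword overlap with embedding-based search.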
Conclusion
Despite the scarce competitor landscape and the lack of established best practices, we delivered a working prototype for implementation and further testing. We also tried to share as many conversation design best practices as possible to help developers understand and use core principles.
Generative AI for commercial products was very scarce at the time of the project. The market has completely changed since: generative AI integration has become more widespread with more examples and best practices. We would start this product design process differently now.
Prefixbox has a well-established prototype that they can use for pitches or as a minimum viable product. Their domain knowledge in this field increased dramatically, and they have since applied it in other products.