Improving Relevance in a RAG Model for a Top Contextual Conversation Startup

Model Evaluations and Search Prompting

A top 50 startup serving millions of users develops personalized, contextual RAG chatbots that participate in group conversations, each conversing as a specific persona. Invisible evaluated model conversations and supplied improved search prompts, helping the client improve model performance and contextual relevance efficiently.

Conversations Evaluated per Week: 300

Problem: Chatbot Responses Lacked Accuracy

The client faced a critical issue: their chatbots, while performing well in one-on-one chats, struggled in group settings. Customers noticed that the personas of these chatbots would shift unpredictably, creating confusion and undermining trust.

The client needed a strategic partner that could scale, ensure quality, and assess and refine chatbot responses with precision.

Solution: Expert Human Review and Prompt Generation

Invisible's team worked within the client's platform to review model conversations supplied by the research team and assess the accuracy of the model's responses. We evaluated each conversation for flow and factual accuracy, and provided revised search prompts.

These prompts were highly specific to the client: each had to match the tone and persona of the chatbot it was written for. Integrating the resulting search results improved subsequent model responses.

Impact: Improved Relevance

By guiding the AI to incorporate search results seamlessly, the client’s RAG chatbot transformed the quality of group chats, making them more engaging and more relevant.

"Invisible is flexible and attentive to our needs. They have been close working partners and have been able to work closely with us to improve processes and annotation outputs."
- Head of Trust and Safety
