Build It Or Buy It? Generative AI’s Million Dollar Question!
One of the most popular patterns to emerge in Generative AI is the “RAG Pattern”: context retrieved from your ground-truth knowledge bases (like documents or helpdesks) is included in the API call to the LLM, which then forms a response grounded in that context.
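As a rough illustration, here is a minimal sketch of the pattern in Python, assuming the `openai` client package and a hypothetical `search_knowledge_base()` retrieval helper that you would wire to your own vector store (neither is part of any specific platform):

```python
# Minimal RAG sketch: retrieve relevant chunks, then ground the LLM response in them.
# Assumes the `openai` Python package and a hypothetical search_knowledge_base() helper.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def search_knowledge_base(question: str, top_k: int = 4) -> list[str]:
    """Hypothetical retrieval step: embed the question and return the
    top-k most similar chunks from your own vector store."""
    raise NotImplementedError("wire this to your vector store")


def answer_with_rag(question: str) -> str:
    chunks = search_knowledge_base(question)
    context = "\n\n".join(chunks)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any chat-completion model works here
        messages=[
            {"role": "system",
             "content": "Answer ONLY from the provided context. "
                        "If the answer is not in the context, say you don't know."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content
```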
Most developers instantly turn towards frameworks like Langchain or use the ChatGPT API directly to build a quick prototype using their PDF documents.
Sounds great, right?
But the big question is: How do you then take this to production?
How do you deal with all the issues related to data ingestion? Issues like failed document parsing, connectors you need to support (e.g., Youtube!), or tricky document formats (try doing OCR on a 100-page PDF!). And more importantly, how do you put a management interface on top of all that data and make it accountable?
And data ingestion is just the start. What happens when the responses start hallucinating and you get calls from managers and customer support (for real!)?
Or what happens when someone at your company asks you to justify the response the AI just spit out? Do you have citations? Do you have transparency about how the response was calculated? Are these audit trails instantly available without MLOps investigating each issue?
Bloomberg might have built its own LLM (apparently at a cost of about $100M), but is “Build It” the right choice for your RAG system, a system that you will then have to run, operate, debug, and maintain?
In this blog post, I present an alternative: Buy It!
Just like you let OpenAI deal with the LLM, you can let someone else deal with the 100 issues related to operating, maintaining, enhancing, and supporting a RAG platform. Ideally, at 1/100th of the TCO.
And go live in production in *minutes* — not days or months. A cloud RAG SaaS platform lets you do this and instantly deploy your solution with your content — without getting into all the intricacies around deployment and MLOps.
Whether you go with a cloud-hosted platform or an in-tenant one that runs within your own tenant/VPC, the cost of running the RAG platform can be far lower than the cost of building everything yourself from scratch.
Just dealing with hallucinations alone could save you thousands of hours of very expensive developer time. And just think about the cost of keeping your developer team around for months (or years!) to maintain and enhance your RAG pipeline as innovations in AI continue.
Technical Details
Here are some components you will need to think through, whether you build the RAG pipeline yourself or use a SaaS platform.
Building the RAG
By using a no-code or low-code RAG platform, you can ingest ALL your data sources to create a comprehensive ground-truth knowledgebase. Whether it is documents (in hundreds of formats) or web content like helpdesks, websites, Youtube videos, and podcasts, you can build that knowledgebase in minutes rather than wrestling with data ingestion issues yourself.
No kidding: we have now spent 10,000+ hours dealing with data ingestion. The number of issues that arise is quite amazing.
While all the media attention seems to be centered on the AI models themselves, it’s good old data ingestion that is the most painful aspect of building AI systems.
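To make that pain concrete, here is a minimal sketch of the kind of dispatcher you end up writing yourself, assuming the `pypdf` and `beautifulsoup4` packages; in a real pipeline every branch grows its own OCR fallback, retries, and connector-specific quirks:

```python
# Minimal ingestion sketch: route each source to a parser and never let one
# bad document kill the whole pipeline. Assumes pypdf and beautifulsoup4.
from pathlib import Path

from bs4 import BeautifulSoup
from pypdf import PdfReader


def extract_pdf(path: Path) -> str:
    reader = PdfReader(path)
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    if not text.strip():
        # Scanned PDF with no text layer: this is where an OCR fallback
        # (e.g. Tesseract) would go, and where the real pain begins.
        raise ValueError(f"no extractable text in {path}")
    return text


def extract_html(path: Path) -> str:
    soup = BeautifulSoup(path.read_text(errors="ignore"), "html.parser")
    return soup.get_text(separator="\n")


PARSERS = {".pdf": extract_pdf, ".html": extract_html, ".htm": extract_html}


def ingest(paths: list[Path]):
    corpus, failures = {}, []
    for path in paths:
        parser = PARSERS.get(path.suffix.lower())
        if parser is None:
            failures.append((path, "unsupported format"))
            continue
        try:
            corpus[path] = parser(path)
        except Exception as exc:  # log and keep going
            failures.append((path, str(exc)))
    # In production you also need a management UI over `failures`, plus
    # connectors for non-file sources (YouTube, helpdesks), re-sync, dedupe...
    return corpus, failures
```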
Querying the RAG
The good news about a “Buy It” solution is: someone else has dealt with all the tough problems around hallucinations, citations, and query relevancy. These are by far the toughest parts of building a RAG solution. Think about decisions like the following (a small sketch of two of these knobs follows the list):
1. How does your chunking strategy affect hallucinations?
2. What effect do long conversations have on hallucinations?
3. Where did this AI response come from? (citation/sources)
4. What effect did your custom instructions (`system` value) have on response quality?
5. How does your anti-hallucination strategy perform when there is a sudden turn in the multi-turn conversation?
6. Do the chunks returned by the vectorDB search need to be re-ranked? And if so, what is the effect on the query relevancy and hallucinations?
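Here is the promised sketch of just two of those knobs, chunking and re-ranking; the chunk size, overlap, and keyword-overlap scorer are illustrative assumptions that a real system would replace with tuned values and a proper re-ranking model:

```python
# Sketch of two of the knobs above: overlapping chunking and a re-ranking pass.
# The sizes and the scoring function are illustrative assumptions, not recommendations.

def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    """Fixed-size chunks with overlap. Too small and answers lose context
    (more hallucination); too large and retrieval gets noisy."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks


def rerank(question: str, candidates: list[str], top_k: int = 4) -> list[str]:
    """Placeholder re-ranker: keyword overlap with the question.
    A real system would use a cross-encoder or a hosted re-ranking model."""
    q_terms = set(question.lower().split())
    scored = sorted(
        candidates,
        key=lambda c: len(q_terms & set(c.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

# Typical flow: vector search returns ~20 candidates, re-ranking keeps the best 4,
# and only those 4 (with their source IDs, for citations) go into the prompt.
```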
And these are just the query relevancy aspects. Don’t forget about the MLOps side: API failures, rate limiting, redundancy across providers, jailbreaking, safety guardrails, and so on.
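As one small example of that operational surface, here is a minimal retry-with-backoff wrapper with a model fallback, assuming the `openai` client package; the model names and retry budget are assumptions, and a production system layers logging, alerting, and circuit breakers on top:

```python
# Sketch of the operational layer: retry with exponential backoff on rate limits,
# then fall back to a secondary model. Model names and retry counts are
# illustrative assumptions.
import time

from openai import APIError, OpenAI, RateLimitError

client = OpenAI()


def chat_with_fallback(messages: list[dict],
                       models: tuple[str, ...] = ("gpt-4o", "gpt-4o-mini"),
                       max_retries: int = 3) -> str:
    for model in models:
        for attempt in range(max_retries):
            try:
                resp = client.chat.completions.create(model=model, messages=messages)
                return resp.choices[0].message.content
            except RateLimitError:
                time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, 4s
            except APIError:
                break  # provider-side failure: try the next model instead
    raise RuntimeError("all models and retries exhausted")
```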
Maintenance, Support, and Enhancements
A big question to ask yourself is: Is this a core competency of your company? Would you like to continue maintaining, supporting, and improving the RAG pipeline?
In a previous company, I developed my own CRM. Then, SaaS players like Hubspot and Salesforce came along and simply crushed what we were doing. Why? Because that is what they did day in and day out. They kept improving their systems. They kept innovating. They kept growing and learning from their large pool of customers. And ALL their customers benefited from the software that they kept building (economy of scale!).
Live Demo
To see an example of a RAG chatbot built using a SaaS platform, see: https://adorosario.github.io/dents/
This tool was built AND deployed in about 30 minutes, using ground-truth knowledge like Pubmed papers and curated articles. With zero coding, it was put in front of researchers and physicians working on this rare disease. From idea to a deployed tool in less than a day!
FAQ
Q: Wait — do you have any real-life case studies?
A: Yes — see this case study with real results (lead generation!)
Q: So, what would you recommend Langchain for?
A: Langchain is great if your documents are ultra-private and need to remain in-tenant within your own Azure instance; in that case, Langchain is the best solution right now.
Q: How does a “Buy It” solution handle security concerns around sensitive data?
A: Most reputable SaaS solutions offer robust security protocols, including encryption in transit and at rest. Some platforms also offer solutions that run within your own tenant/VPC, giving you additional control over data security.
Q: How scalable is a SaaS solution in comparison to building our own RAG?
A: A well-designed SaaS solution is built to scale with your needs. It often benefits from economies of scale, providing faster performance and better uptime than individual deployments.
Q: What kind of support can we expect with a SaaS solution?
A: This largely depends on the provider, but most premium platforms offer dedicated support channels, extensive documentation, and regular updates.
Some may even offer tailored assistance for unique use cases or onboarding sessions. For example, in response to customer demand, we recently introduced a “Prototyping Package” to help businesses get started with GenAI projects.
Q: How customizable is a SaaS platform? Can it be tailored to specific business needs?
A: While a SaaS platform is designed to be general-purpose, many offer a degree of customization, from simple UI/UX adjustments to more complex integrations with other business tools. It’s essential to clarify the level of customization possible with the provider before committing.
Q: Are there any hidden costs associated with SaaS platforms?
A: Costs will vary depending on the platform. While some providers offer a transparent pricing model, always be sure to ask about any potential additional costs, such as for extra data storage (Pinecone!), additional API calls (ChatGPT-4 is expensive — hah!), or premium features (e.g., OCR, Team management, Chat Logs, etc.).
Q: Are there any invisible benefits associated with SaaS platforms?
A: Yes, there are two major benefits emerging:
- Software development costs: Does your company really want to manage a software development project for the next N years? Using a SaaS platform greatly cuts down the TCO.
- Time to market: By piggybacking on the tens of thousands of hours of development someone else has done, you can greatly cut down your time-to-market, especially if you are in a competitive market. With AI crushing some industries (like legal and software!), falling behind on AI is NOT an option for some companies.
“My fear is that if we don’t radically incorporate GenAI into our business soon, we will cease to exist” — CEO, Systems Integrator.