Don’t Just Build Agents, Build Memory-Augmented AI Agents
Insight Breakdown:
This piece argues that, whether you follow Anthropic's multi-agent coordination or Cognition's single-threaded consolidation, sophisticated memory management emerges as the fundamental determinant of agent reliability, believability, and capability. It marks the evolution from stateless AI applications toward truly intelligent, memory-augmented systems that learn and adapt over time.
AI agents are intelligent computational systems that can perceive their environment, make informed decisions, use tools, and, in some cases, maintain persistent memory across interactions, evolving beyond stateless chatbots toward autonomous action. Multi-agent systems coordinate multiple specialized agents to tackle complex tasks, like a research team where different agents handle searching, fact-checking, citations, and research synthesis.
Recently, two major players in the AI space released differing perspectives on how to build these systems. Anthropic released an insightful piece highlighting their learnings from building multi-agent systems for deep research use cases. Cognition also released a post titled "Don't Build Multi-Agents," which appears to contradict Anthropic's approach directly.
Two things stand out:
Both pieces are right
Yes, this sounds contradictory. But from working with customers building agents of all scales and sizes in production, we find that the use case and, in particular, the application mode are the key factors to consider when determining how to architect your agent(s).
Anthropic's multi-agent approach makes sense for deep research scenarios where sustained, comprehensive analysis across multiple domains over extended periods is required.
Cognition's single-agent approach is optimal for conversational agents or coding tasks where consistency and coherent decision-making are paramount. The application mode—whether research assistant, conversational agent, or coding assistant—fundamentally shapes the optimal memory architecture. Anthropic also highlights this point when discussing the downsides of multi-agent architectures.
For instance, most coding tasks involve fewer truly parallelizable tasks than research, and LLM agents are not yet great at coordinating and delegating to other agents in real time.
Anthropic, Building Multi-Agent Research System
Both pieces are saying the same thing
Memory is the foundational challenge that determines agent reliability, believability, and capability. Anthropic emphasizes sophisticated memory management techniques (compression, external storage, context handoffs) for multi-agent coordination. Cognition emphasizes context engineering and continuous memory flow to prevent the fragmentation that destroys agent reliability.
Both teams arrived at the same core insight: agents fail without robust memory management. Anthropic chose to solve memory distribution across multiple agents, while Cognition chose to solve memory consolidation within single agents.
The key takeaway from both pieces, for AI engineers or anyone developing an agentic platform, is this: don't just build agents, build memory-augmented AI agents.
With that out of the way, the rest of this piece distills the essential insights from both pieces and points to the memory management principles and design patterns we've observed among our customers building agents.
The key insights
If you are building your agentic platform from scratch, you can extract much value from Anthropic's approach to building multi-agent systems, particularly their sophisticated memory management principles, which are essential for effective agentic systems.
Their implementation reveals critical design considerations, including techniques to overcome context window limitations through compression, function calling, and storage functions that enable sustained reasoning across extended multi-agent interactions. These are foundational elements that any serious agentic platform must address from the architecture phase.
Key insights:
Agents are overthinkers
Multi-agent systems trade efficiency for capability
Systematic agent observation reveals failure patterns
Context windows remain insufficient for extended sessions
Context compression enables distributed memory management
Let's go a bit deeper into how these insights translate into practical implementation strategies.
Agents are overthinkers
Anthropic's researchers mention using explicit guidelines to steer agents toward allocating the right amount of resources (tool calls, sub-agent creation, etc.); without them, agents tend to overengineer solutions. Absent proper constraints, the agents would spawn excessive subagents for simple queries, conduct endless searches for nonexistent information, and apply complex multi-step processes to tasks requiring straightforward responses.
Explicit guidance for agent behavior isn't entirely new—system prompts and instructions are typical parameters in most agent frameworks. However, the key insight here goes deeper than traditional prompting approaches.
When agents are given access to resources such as data, tools, and the ability to create sub-agents, there needs to be explicit, unambiguous direction on how these resources are expected to be leveraged to address specific tasks. This goes beyond system prompts and instructions into resource allocation guidance, operational constraints, and decision-making boundaries that prevent agents from overengineering solutions or misusing available capabilities.
Take, for example, the OpenAI Agents SDK, which exposes several parameters for describing resource behavior to the agent. One is handoff_description, which specifies how a subagent should be leveraged in a multi-agent system built with the SDK. Another is the explicit tool_use_behavior argument, which, as the name suggests, describes how the agent should handle a tool's output. The key takeaway for AI engineers is that implementing a multi-agent system requires extensive thinking about which tools the agents are expected to leverage, which subagents belong in the system, and how resource utilization is communicated to the calling agent, as the sketch below illustrates.
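Here is a minimal sketch of how those two parameters are wired up, assuming the openai-agents Python package; the agent names, instructions, and stub tool are hypothetical, not Anthropic's or OpenAI's actual setup:

```python
# Minimal sketch, assuming the OpenAI Agents SDK (pip install openai-agents).
# Agent names, instructions, and the stub tool are illustrative only.
from agents import Agent, function_tool

@function_tool
def search_docs(query: str) -> str:
    """Search internal documentation for a query."""
    return f"Results for: {query}"  # stub implementation

# handoff_description tells the calling agent when this subagent is worth
# delegating to, which constrains unnecessary handoffs.
citation_agent = Agent(
    name="Citation Agent",
    handoff_description="Only for formatting citations of already-gathered sources.",
    instructions="Format the provided sources as citations. Do not do new research.",
)

# tool_use_behavior="stop_on_first_tool" treats the first tool result as the
# final output instead of letting the agent keep reasoning and re-searching.
search_agent = Agent(
    name="Search Agent",
    instructions="Answer with a single documentation search.",
    tools=[search_docs],
    tool_use_behavior="stop_on_first_tool",
)

coordinator = Agent(
    name="Research Coordinator",
    instructions="Delegate narrowly. Prefer direct answers over spawning subagents.",
    handoffs=[citation_agent],
)
```

Note how the constraints live in the resource descriptions themselves, not just the system prompt: the calling agent sees them at the point where it decides whether to delegate or call a tool.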
When implementing resource allocation constraints for your agents, consider that traditional approaches of managing multiple specialized databases (a vector DB for embeddings, a graph DB for relationships, a relational DB for structured data) compound the complexity problem and introduce tech stack sprawl, an anti-pattern for rapid AI innovation.
Multi-agent systems trade efficiency for capability
While multi-agent architectures can utilize more tokens and parallel processing for complex tasks, Anthropic found operational costs significantly higher due to coordination overhead, context management, and the computational expense of maintaining coherent state across multiple agents. In multi-agent systems, two heads may be better than one, but they are also considerably more expensive.
One thing we note here is that the use case in Anthropic's multi-agent system is deep research. This use case requires extensive exploration of resources, including dense research papers, websites, and documentation, to accumulate enough information to produce the final output, which is typically a 2,000+ word report on the user's starting prompt.
In other use cases, such as automated workflows with agents representing processes within the workflow, there might not be as much token consumption, especially if each process encapsulates deterministic steps such as database read and write operations and its output is an execution result consisting of a sentence or short summary.
The coordination overhead challenge becomes particularly acute when agents need to share state across different storage systems. Rather than managing complex data synchronization between specialized databases, MongoDB's native ACID compliance ensures that multi-agent handoffs maintain data integrity without external coordination mechanisms. This unified approach reduces both the computational overhead of distributed state management and the engineering complexity of maintaining consistency across multiple storage systems.
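To make that concrete, here is a minimal handoff sketch using the PyMongo driver. The collection names and document shapes are hypothetical, and a replica set is assumed, since MongoDB multi-document transactions require one:

```python
# Minimal sketch, assuming a MongoDB replica set and the pymongo driver.
# Collection names ("agent_state", "handoffs") are hypothetical.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
db = client["agents"]

def hand_off(from_agent: str, to_agent: str, context_summary: str) -> None:
    """Record a handoff and update both agents' state in one atomic transaction."""
    with client.start_session() as session:
        with session.start_transaction():
            db.handoffs.insert_one(
                {"from": from_agent, "to": to_agent, "summary": context_summary},
                session=session,
            )
            db.agent_state.update_one(
                {"agent": from_agent}, {"$set": {"status": "idle"}}, session=session
            )
            db.agent_state.update_one(
                {"agent": to_agent},
                {"$set": {"status": "active", "context": context_summary}},
                session=session,
            )
        # The transaction commits on exiting the block: all three writes
        # become visible together, or none do.
```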
Context compression enables distributed memory management
Beyond reducing inference costs, compression techniques allow multi-agent systems to maintain shared context across distributed agents. Anthropic's approach involves summarizing completed work phases and storing essential information in external memory before agents transition to new tasks. This, coupled with the insight that context windows remain insufficient for extended sessions, shows that prompt compression and compaction techniques are still relevant and useful in a world where LLMs have extensive context windows.
Even with a 200K token (approximately 150,000 words) capacity, Anthropic’s agents in multi-round conversations require sophisticated context management strategies, including compression, external memory offloading, and spawning fresh agents when limits are reached.
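A minimal compaction sketch under stated assumptions: tiktoken serves as a stand-in tokenizer (Anthropic's own tokenizer differs), summarize() is a helper you supply (for example, a cheap LLM call), and external_memory is a MongoDB collection:

```python
# Minimal compaction sketch. tiktoken is a stand-in tokenizer; summarize()
# and the external_memory collection are assumptions you supply yourself.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
CONTEXT_BUDGET = 150_000  # leave headroom below a 200K-token window
KEEP_RECENT = 10          # always keep the most recent turns verbatim

def compact(messages: list[dict], summarize, external_memory) -> list[dict]:
    """Summarize and offload older messages once the token budget is exceeded."""
    total = sum(len(enc.encode(m["content"])) for m in messages)
    if total <= CONTEXT_BUDGET or len(messages) <= KEEP_RECENT:
        return messages
    head, tail = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
    summary = summarize(head)  # compress the completed work phase
    ref = external_memory.insert_one({"summary": summary, "raw": head}).inserted_id
    # Replace old turns with a compact summary plus a reference to full detail.
    compacted = {"role": "system",
                 "content": f"Summary of earlier work (full record: {ref}): {summary}"}
    return [compacted] + tail
```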
We previously partnered with Andrew Ng and DeepLearning.AI on a course covering prompt compression techniques and retrieval-augmented generation (RAG) optimization.
Systematic agent observation reveals failure patterns
Systematic agent observation represents one of Anthropic's most practical insights. Essentially, rather than relying on guesswork (or vibes), the team built detailed simulations using identical production prompts and tools, then systematically observed step-by-step execution to identify specific failure modes. This phase of building an agentic system carries substantial operational cost.
From our perspective, working with customers building agents in production, this methodology addresses a critical gap most teams face:
understanding how your agents actually behave versus how you think they should behave
. Anthropic's approach immediately revealed concrete failure patterns that many of us have encountered but struggled to diagnose systematically.
Their observations uncovered agents overthinking simple tasks, as mentioned earlier, using verbose search queries that reduced effectiveness, and selecting inappropriate tools for specific contexts.
As they note in their piece: "This immediately revealed failure modes: agents continuing when they already had sufficient results, using overly verbose search queries, or selecting incorrect tools. Effective prompting relies on developing an accurate mental model of the agent."
The key insight here is moving beyond trial-and-error prompt engineering toward purposeful debugging. Instead of making assumptions about what should work, Anthropic demonstrates the value of systematic behavioral observation to identify the root causes of poor performance. This enables targeted prompt improvements based on actual evidence rather than intuition.
We find that gathering, tracking, and storing agent process memory serves a dual critical purpose: not only is it vital for agent context and task performance, but it also provides engineers with the essential data needed to evolve and maintain agentic systems over time. Agent memory and behavioral logging remain the most reliable methods for understanding system behavior patterns, debugging failures, and optimizing performance, regardless of whether you implement a single comprehensive agent or a system of specialized subagents collaborating to solve problems. MongoDB's flexible document model naturally accommodates the diverse logging requirements for both operational memory and engineering observability within a single, queryable system.
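As one illustration of that dual purpose, the sketch below logs each agent step as a document and then queries the same collection for a failure signature. The agent_traces collection, the step types, and the threshold of 20 tool calls are all hypothetical choices:

```python
# Minimal behavioral-logging sketch, assuming pymongo and a single
# "agent_traces" collection (the name is hypothetical).
from datetime import datetime, timezone
from pymongo import MongoClient

traces = MongoClient()["agents"]["agent_traces"]

def log_step(run_id: str, agent: str, step_type: str, payload: dict) -> None:
    """Append one agent step (tool call, handoff, LLM turn) as a document."""
    traces.insert_one({
        "run_id": run_id,
        "agent": agent,
        "type": step_type,   # e.g. "tool_call", "handoff", "llm_turn"
        "payload": payload,  # flexible schema: whatever the step produced
        "ts": datetime.now(timezone.utc),
    })

# The same collection serves engineering observability. For example, find
# runs with an unusually high tool-call count (the "overthinking" signature):
overthinkers = traces.aggregate([
    {"$match": {"type": "tool_call"}},
    {"$group": {"_id": "$run_id", "calls": {"$sum": 1}}},
    {"$match": {"calls": {"$gt": 20}}},
])
```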
One thing that would be interesting to learn from the Anthropic research team is what evaluation metrics they use. We've spoken extensively about evaluating LLMs in RAG pipelines, but what new agentic system evaluation metrics are developers working toward?
We are answering these questions ourselves and have partnered with Galileo, a key player in the AI stack whose focus is purely on evaluating RAG and agentic applications and making these systems reliable for production. Our learnings will be shared in an upcoming webinar taking place on July 17, 2025.
However, for anyone building agentic systems, this represents a shift in development methodology: building agents requires building the infrastructure to understand them, and sandbox environments might become a key component of the evaluation and observability stack for agents.
Advanced implementation patterns
Beyond the aforementioned core insights, Anthropic's research reveals several advanced patterns worth examining:
The Anthropic piece hints at the implementation of advanced retrieval mechanisms that go beyond vector-based similarity between query vectors and stored information. Their multi-agent architecture enables sub-agents to call tools (an approach also seen in MemGPT) to store their work in external systems, then pass lightweight references back to the coordinator, presumably unique identifiers for summarized memory components.
We generally emphasize the importance of hybrid retrieval to our customers and developers: combining multiple retrieval methods, such as using vector search to understand intent while simultaneously performing text search for specific product details. MongoDB's native support for vector similarity search and traditional indexing within a single system eliminates the need for complex reference management across multiple databases, simplifying the coordination mechanisms that Anthropic's multi-agent architecture requires.
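As a rough sketch of such hybrid retrieval on MongoDB Atlas, the pipeline below runs a $vectorSearch stage for intent and a $search text stage for exact terms, then fuses the two ranked lists with reciprocal rank fusion in application code. The index names, field paths, and embed() function are assumptions:

```python
# Minimal hybrid-retrieval sketch for MongoDB Atlas. "vector_index",
# "text_index", the field paths, and embed() are assumptions.
from pymongo import MongoClient

col = MongoClient()["catalog"]["products"]

def hybrid_search(query: str, embed, k: int = 10) -> list:
    # Semantic leg: vector similarity over stored embeddings.
    vector_hits = list(col.aggregate([
        {"$vectorSearch": {"index": "vector_index", "path": "embedding",
                           "queryVector": embed(query),
                           "numCandidates": 100, "limit": k}},
    ]))
    # Lexical leg: full-text search for exact product terms.
    text_hits = list(col.aggregate([
        {"$search": {"index": "text_index",
                     "text": {"query": query, "path": "description"}}},
        {"$limit": k},
    ]))
    # Reciprocal rank fusion: score each doc 1 / (60 + rank) per list.
    scores: dict = {}
    for hits in (vector_hits, text_hits):
        for rank, doc in enumerate(hits):
            scores.setdefault(doc["_id"], [0.0, doc])
            scores[doc["_id"]][0] += 1.0 / (60 + rank)
    ranked = sorted(scores.values(), key=lambda s: s[0], reverse=True)
    return [doc for _, doc in ranked[:k]]
```

Because both legs run against the same collection, there is no cross-database reference bookkeeping: a document's _id is valid in both result lists.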
The Anthropic team implements continuity in the agent execution process by establishing clear boundaries between task completion and summarizing the current phase before moving to the next task. This creates a scalable system where memory constraints don't bottleneck the research process, allowing for truly deep and comprehensive analysis that spans beyond what any single context window could accommodate.
In a multi-agent pipeline, each sub-agent produces partial results—intermediate summaries, tool outputs, extracted facts—and then hands them off into a shared "memory" database. Downstream agents then read those entries, append their analyses, and write updated records back. Because these handoffs happen in parallel, you must ensure that one agent's commit doesn't overwrite another's work and that a reader doesn't pick up a half-written summary.
Without atomic transactions and isolation guarantees, you risk:
Lost updates, where two agents load the same document, independently modify it, and then write back, silently discarding one agent's changes.
Dirty or non-repeatable reads, where an agent reads another's uncommitted or rolled-back write, leading to decisions based on phantom data.
Coordinating these handoffs purely in application code would force you to build locking layers or distributed consensus, which quickly becomes a brittle, error-prone web of external orchestrators. Instead, you want your database to provide those guarantees natively, so that each read-modify-write cycle appears to execute in isolation and either fully succeeds or fully rolls back. MongoDB's ACID compliance becomes crucial here, ensuring that these boundary transitions maintain data integrity across multi-agent operations without requiring external coordination mechanisms that could introduce failure points.
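For the per-document case, one way to get that isolation (sketched below, with a hypothetical shared_memory collection and document shape) is an optimistic version check: single-document updates in MongoDB are atomic, so filtering on a version field turns the read-modify-write into a safe compare-and-swap that directly prevents the lost-update scenario above.

```python
# Minimal lost-update guard via optimistic concurrency, assuming pymongo.
# The "shared_memory" collection and its document shape are hypothetical.
from pymongo import MongoClient, ReturnDocument

mem = MongoClient()["agents"]["shared_memory"]

def append_analysis(doc_id, agent: str, analysis: str):
    """Read-modify-write that retries instead of clobbering concurrent work."""
    while True:
        doc = mem.find_one({"_id": doc_id})
        updated = mem.find_one_and_update(
            # The version filter only matches if nobody committed in between.
            {"_id": doc_id, "version": doc["version"]},
            {"$push": {"analyses": {"agent": agent, "text": analysis}},
             "$inc": {"version": 1}},
            return_document=ReturnDocument.AFTER,
        )
        if updated is not None:
            return updated  # committed; another agent's write was not lost
        # Otherwise another agent won the race: re-read and retry.
```

For handoffs that span several records, the multi-document transactions shown earlier provide the same guarantee across documents.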
Application mode is crucial when discussing memory implementation. In Anthropic's case, the application functions as a research assistant, while in other implementations, like Cognition's approach, the application mode is conversational. This distinction significantly influences how agents operate and manage memory based on their specific application contexts. Through our internal work and customer engagements, we extend this insight to suggest that application mode affects not only agent architecture choices but also the distinct memory types used in the architecture.
AI agents need augmented memory
Anthropic’s research makes one thing abundantly clear: context window is not all you need. This extends to the key point that memory and agent engineering are two sides of the same coin. Reliable, believable, and truly capable agents depend on robust, persistent memory systems that can store, retrieve, and update knowledge over long, complex workflows.
As the AI ecosystem continues to innovate on memory mechanisms, mastering sophisticated context and memory management approaches will be the key differentiator for the next generation of successful agentic applications. Looking ahead, we see "memory engineering" or "memory management" emerging as a key specialization within AI engineering, focused on building the foundational infrastructure that lets agents remember, reason, and collaborate at scale.
For hands-on guidance on memory management, check out our webinar on YouTube, which covers essential concepts and proven techniques for building memory-augmented agents.
Head over to the MongoDB AI Learning Hub to learn how to build and deploy AI applications with MongoDB.
July 9, 2025