Cloudera, Inc. and Pinecone have announced a strategic partnership that integrates Pinecone’s artificial intelligence (AI) vector database expertise into Cloudera’s open data platform, aimed at transforming the way organisations harness the power of AI to streamline operations and improve customer experiences.

Pinecone’s vector database is critical infrastructure for generative AI (GenAI). Pinecone is optimised to store AI representations of data (vector embeddings) and search through them by semantic similarity, something traditional databases are very inefficient at doing. This capability is necessary for adding context to queries against applications that use large language models (LLMs). That added context significantly cuts down on erroneous outputs, often referred to as hallucinations, helping search and GenAI applications deliver responses that are accurate and relevant.
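To illustrate what searching by semantic similarity means in practice, the sketch below ranks a few stored embeddings against a query embedding using cosine similarity. It is a toy example and not Pinecone's API: the three-dimensional vectors, the document IDs and the helper names `cosine_similarity` and `top_k` are all assumptions made for illustration. Real embeddings typically have hundreds or thousands of dimensions, and production systems rely on approximate indexes rather than a full scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Brute-force search: score every stored vector, return the k closest IDs."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical knowledge base: document ID -> made-up embedding.
index = {
    "reset-password": [0.9, 0.1, 0.0],
    "billing-faq": [0.1, 0.9, 0.1],
    "login-help": [0.8, 0.2, 0.1],
}

# Made-up embedding of a user question about account access.
query = [1.0, 0.0, 0.0]

results = top_k(query, index, k=2)
print(results)  # ['reset-password', 'login-help']
```

In an LLM application, the text behind the returned IDs would be appended to the prompt as context, which is the step the article credits with reducing hallucinations.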

The partnership will see Cloudera integrate Pinecone’s vector database into Cloudera Data Platform (CDP), enabling organisations to more easily build and deploy highly scalable, real-time, AI-powered applications on Cloudera. This includes the release of a new applied machine learning (ML) prototype (AMP) that will allow developers to more quickly create and augment knowledge bases from data on their own websites, as well as pre-built connectors that will enable customers to more quickly set up ingest pipelines for AI applications. In the AMP, Pinecone’s vector database uses these knowledge bases to add context to chatbot responses, helping to ensure useful outputs.

Customers can use this same architecture to set up or improve support chatbots or internal support search systems. This enables them to reduce operational costs by cutting down on expensive human case handling, and to improve the customer experience with faster resolution times.

Commenting on the partnership, Elan Dekel, vice president, product, Pinecone, said, “Cloudera’s extensive expertise in data management combined with Pinecone’s cutting-edge vector database creates a formidable partnership. A lot of our customers already manage their data with Cloudera. Now it will be easier than ever for them to build AI applications using their embeddings stored with us and data stored with Cloudera. Together we will enable organisations to deliver unparalleled personalised experiences, drive user engagement, and achieve business success.”

Meanwhile, Abhas Ricky, chief strategy officer, Cloudera, said, “We are excited to bring the power of Pinecone vector database and semantic search capabilities to our public cloud customers to accelerate GenAI use cases, and significantly improve the developer experience at scale.”

Further, Sanjeev Mohan, founder, SanjMo and former analyst, Gartner, said, “Integration of Pinecone with CDP adds a very critical new functionality that will help clients build GenAI applications. In addition, the planned integration between the open-source Apache NiFi-based Cloudera DataFlow (CDF) and Pinecone further bolsters CDP’s emphasis on universal data distribution for AI. CDP customers can bring AI to where their data resides, on-premises, in the cloud or on the edge.”