If an intelligent system hallucinates, relies on stale information, or struggles with memory in longer conversations, Retrieval Augmented Generation (RAG) can help. But for high-stakes tasks involving complex data and nuance, RAG also needs access to structured domain knowledge, an approach known as hybrid RAG.
What is RAG?
Retrieval Augmented Generation is an AI framework that enhances large language models (LLMs) by connecting them to external data sources. It has two working parts. The first part is a retriever that fetches relevant content from a knowledge base, which might be a document collection or a vector store.
The second part is a generator that uses the retrieved information to generate more accurate, grounded responses. In combination, these two parts of RAG can make an AI model’s outputs more useful and trustworthy, raising its domain intelligence.
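The two parts can be sketched in a few lines. This is a toy illustration, not a specific library's API: the word-overlap scoring stands in for a real retriever, and the prompt-building step stands in for the generator's grounding.

```python
# Minimal RAG sketch: a toy retriever plus a grounded prompt for the
# generator. Word-overlap scoring is a stand-in for real vector search.

def retrieve(query, documents, top_k=2):
    """Rank documents by simple word overlap with the query."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def build_prompt(query, documents):
    """Ground the model's answer in the retrieved passages."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "RAG pairs a retriever with a generator.",
    "The retriever fetches relevant content from a knowledge base.",
    "Unrelated text about gardening.",
]
print(build_prompt("What does the retriever do?", docs))
```

The generator never sees the gardening document: only retrieved passages reach the prompt, which is what grounds the answer.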
What is hybrid RAG?
Hybrid RAG is a more advanced version of RAG that uses two kinds of search. The first is semantic search (known as dense search), which matches on meaning. The second is keyword search (sparse search), which looks for specific words. Combining these two kinds of search makes it less likely that important details will be missed.
Another feature of hybrid RAG is its ability to use both structured data, such as ontologies, knowledge graphs and relational databases, and unstructured data, such as PDFs and websites. The results are merged or ranked, then passed on to the LLM to improve recall, accuracy, and explainability, especially in specialised domains known as verticals.
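One common way to merge the two result lists is reciprocal rank fusion (RRF), sketched below. The document IDs and rankings are invented for illustration; k=60 is the constant conventionally used in RRF.

```python
# Reciprocal rank fusion: each list contributes 1/(k + rank) per
# document, so items ranked well by BOTH searches rise to the top.

def rrf(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc_a", "doc_b", "doc_c"]   # semantic (meaning-based) results
sparse = ["doc_b", "doc_d", "doc_a"]  # keyword (exact-match) results
print(rrf([dense, sparse]))
```

Here `doc_b` wins overall despite topping only the sparse list, because it appears high in both rankings, which is exactly the behaviour that stops important details slipping through one search method alone.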
What is RAG good at?
RAG is well suited to chatbots, research tools, and customer support – among other things. That's because, firstly, it excels at pulling in relevant external information while citing its sources. This enriches conversations and makes them more trustworthy.
Memory management is the other reason for its popularity. RAG treats previous conversations as part of its memory or knowledge base. This enables RAG to retain conversation context for longer, in a way that LLMs cannot always be relied on to do.
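Treating past turns as a retrievable store can be sketched as follows. This is a minimal illustration under the assumption of a simple word-overlap ranking; a real system would use embeddings, and the class and method names here are invented.

```python
# Conversation memory as a retrievable store: earlier turns are indexed
# like any other document, so relevant context can be recalled long
# after it would fall out of an LLM's fixed context window.

class ConversationMemory:
    def __init__(self):
        self.turns = []

    def add(self, speaker, text):
        """Record a turn so it can be retrieved later."""
        self.turns.append((speaker, text))

    def recall(self, query, top_k=1):
        """Return the past turns most relevant to the query."""
        words = set(query.lower().split())
        scored = sorted(
            self.turns,
            key=lambda turn: len(words & set(turn[1].lower().split())),
            reverse=True,
        )
        return scored[:top_k]

memory = ConversationMemory()
memory.add("user", "My deployment target is Kubernetes.")
memory.add("user", "I prefer answers with code examples.")
print(memory.recall("Which deployment target did I mention?"))
```

Only the turn about deployment comes back for that query, so the model can be reminded of a detail from much earlier without replaying the whole conversation.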
What are hybrid RAG’s drawbacks?
Hybrid RAG makes LLMs more reliable in some ways, but it can struggle at scale and is harder to maintain, requiring both technical and domain expertise (knowledge engineering). Accessing data from different systems can also present security and privacy risks.
Most of these issues stem from hybrid RAG's additional complexity. Querying large knowledge graphs and high-dimensional vector databases takes time to generate results. It demands high computational power, memory, and storage, especially when the system hasn't been well optimised. Multimodal RAG – which handles text, tables and images – takes this complexity to an even higher level.
Optimisation does mitigate these problems, however. Hybrid RAG's speed improves significantly when metadata is applied to updated documents, for example. A 2025 study provides actionable best practices for implementing hybrid RAG systems.
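Metadata helps because it shrinks the candidate set before any expensive vector or graph lookup runs. The sketch below shows the idea; the documents and field names (`year`, `dept`) are hypothetical.

```python
# Illustrative metadata pre-filter: only documents matching the query's
# scope are passed on to the costlier retrieval stage.

documents = [
    {"text": "2023 pricing policy", "year": 2023, "dept": "sales"},
    {"text": "2025 pricing policy", "year": 2025, "dept": "sales"},
    {"text": "2025 hiring policy", "year": 2025, "dept": "hr"},
]

def filter_by_metadata(docs, **criteria):
    """Keep only documents whose metadata matches every criterion."""
    return [d for d in docs
            if all(d.get(key) == value for key, value in criteria.items())]

candidates = filter_by_metadata(documents, year=2025, dept="sales")
print([d["text"] for d in candidates])
```

With three documents the saving is trivial, but at corpus scale filtering first means the vector search only scores a fraction of the collection.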
Why do ontologies matter for hybrid RAG?
An ontology is basically a map and rulebook of a subject area. It sets out key terms and how they relate to each other so that AI can interpret and process a domain’s knowledge more accurately. Hybrid RAG can reference these knowledge frameworks and, if customised, use them in a rules-based inference loop. This enables the application of constraints and boundaries to guide what an AI agent can do, decide or use.
With dynamic access to ontologies, hybrid RAG can improve an AI model’s outputs and reduce the chances of hallucination. However, this all depends on the quality of the data within the ontology.
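A rules-based constraint check of this kind can be illustrated with a toy ontology. The drug facts below are invented purely for the sketch; the point is the mechanism, where a claim is accepted only if the ontology declares that relationship.

```python
# Toy ontology as (subject, relation) -> allowed objects. An agent's
# proposed claim is checked against it before being acted on.

ontology = {
    ("aspirin", "treats"): {"headache", "fever"},
    ("aspirin", "interacts_with"): {"warfarin"},
}

def is_permitted(subject, relation, obj):
    """Allow a claim only if the ontology declares that relationship."""
    return obj in ontology.get((subject, relation), set())

print(is_permitted("aspirin", "treats", "headache"))  # within the ontology
print(is_permitted("aspirin", "treats", "insomnia"))  # not declared: rejected
```

An inference loop would run a check like this on each candidate statement, which is how the ontology's constraints bound what the agent can do, decide or use.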
How do I build an ontology for hybrid RAG?
Modern ontology platforms aim to let domain experts develop a working ontology without the need for technical training or hiring an ontology engineer. They provide a code-free process for defining terms, setting up relationships, and organising the structure. In larger organisations or communities, consensus building methods are often used to agree the terminology and get a working ontology over the line.
A good starting point is to ask your experts three key domain questions.