
The Art of Intelligent Retrieval: Unlocking the Power of Search

Ever asked your preferred AI engine a brilliant question, only to get a well-phrased yet pretty useless answer right back? Congrats, you are one of the many, many people dealing with the problem cognee was designed to solve.

In a world drowning in data, we're building the lifeline: smarter retrieval. Okay, okay… It's clearly not quite that dramatic, but—in knowledge-graph-based AI systems, underwhelming retrievals are a pivotal problem we're excited to be taking head-on.

In this post, we will show how we are tackling it from different angles using different retrievers. While we've explored vector databases and how they enable semantic search, and we've looked at how GraphRAG supercharges search with knowledge graphs, we haven't yet pulled back the curtain on the actual retrievers.

That's what we'll do right here, right now. Here's what's ahead:

  1. Preliminaries — A bit of a catch-up so we're all on the same page
  2. A Layered Approach to Search — How cognee's retrieval system is structured
  3. The Retriever Gallery: Specialized Search Mechanisms — A walkthrough of each retriever and what it's for
  4. What Kind of Retriever Are You? — A closing reflection... with a personality-identifying quiz.

Yes, you read that right. For those who stick around until the end, we've included a fun quiz to help you discover which Cognee retriever fits your style. (Are you a structured GraphCompletionRetriever or a free-flowing NaturalLanguageRetriever? You'll soon find out!)

Preliminaries: Cognee's Knowledge Architecture

Before we dive into retrievers, let's quickly recap the core components that power cognee's retrieval system. If you're already familiar with these concepts, feel free to skip ahead.

Tools & Platforms

  • cognee: That's us. Hi there! We've built cognee—an open-source AI memory framework designed to help applications understand, structure, and retrieve knowledge more effectively. cognee ingests raw data and transforms it into a dynamic, searchable knowledge graph, combining the strengths of vector and graph search to enable flexible, context-aware retrieval.
  • Graph Store: The database system (Neo4j, Memgraph, or NetworkX) that maintains cognee's knowledge graph structure, storing nodes, edges, and their properties for relationship-based queries.
  • Vector Store: The database system that indexes and stores vector embeddings for text elements (chunks, summaries, entities) and facilitates powerful search based on semantic similarity.

Core Concepts

  • Knowledge Graph: A structured representation of information as interconnected entities and relationships. In cognee, KGs form the backbone of contextual understanding.
  • Nodes: The points in cognee's knowledge graph that represent entities, documents, chunks, or other discrete data points. Each node contains properties that describe it.
  • Edges: The connections between nodes that represent relationships like "mentions," "is_related_to," or "contains." Edges define how data entities relate to each other.

cognee's Building Blocks

  • Documents: Raw text files, PDFs, web pages, and other content sources that cognee ingests, processes, and exposes for efficient retrieval.
  • Chunks: Self-contained, typically paragraph-sized fragments of text extracted from documents during ingestion. Documents are processed chunk by chunk, and the extracted data is embedded and stored in the graph and vector databases.
  • Entities: Named objects, concepts, people, places, or things identified within documents/chunks. These become nodes in cognee's knowledge graphs and can be connected to other entities or document chunks.
  • Summaries: Strategically chosen parts of the graph are condensed into summaries, which then become new nodes in the graph.

If you've read the preliminaries, you're all set. If you skipped ahead—no worries. This is where we start to see how it all works in practice.

Once your data is ingested using cognee.cognify(), it becomes searchable. From there, cognee.search() is your main interface. You can ask questions and get back relevant answers—whether you need summaries, direct excerpts, graph traversals, or natural language responses. If you haven't already, check out the quickstart example to see the full pipeline in action.

The cognee.search() function supports multiple search types, which define how your question gets answered. You don't need to worry about the mechanics—just specify a SearchType, and cognee handles the rest.

Under the hood, each search type is powered by a different retriever—a component that does the actual work of fetching, scoring, and formatting the result. Despite their differences, all retrievers follow the same structure, built around a shared BaseRetriever class.

This base class defines a simple two-step process:

  1. Gather context
  2. Return or generate an answer

This structure keeps the system modular, transparent, and easy to extend. It also makes debugging and custom development far more approachable.

Next, we'll break down how the search() function works, take a closer look at BaseRetriever, and then get to know the individual retrievers and see how each one solves a different type of problem.

1. The Search API: Your One-Line Lookup

cognee.search() is dead simple to use: just pass in a question, pick a SearchType, and get back relevant, structured results—no extra setup required.

Here's a basic example:
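A sketch of what that call looks like—treat it as illustrative, since the exact import path for SearchType and the available search types can vary between cognee versions, and running it requires a configured cognee installation with an LLM key:

```python
import asyncio
import cognee
from cognee import SearchType  # import path may differ across versions

async def main():
    # Ingest and process some text first (see the quickstart for details).
    await cognee.add("cognee turns raw data into a searchable knowledge graph.")
    await cognee.cognify()

    # Ask a question; the SearchType decides which retriever answers it.
    results = await cognee.search(
        query_text="What does cognee do with raw data?",
        query_type=SearchType.GRAPH_COMPLETION,
    )
    print(results)

asyncio.run(main())
```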

The query_text is your question, and the SearchType tells cognee how to approach it. You can optionally specify a user or a list of datasets—but if you don't, cognee will fall back on sensible defaults.

Behind the scenes, search() handles quite a bit:

  • Prepares and validates your inputs
  • Selects the appropriate retriever for the given search type
  • Runs the retriever to gather context and generate an answer
  • Filters results based on user permissions

You don't need to worry about how graph queries or vector lookups are executed—cognee keeps the API simple while quietly handling all the complexity. In the next section, we'll take a look at how that complexity is structured.

2. Shared Structure for All Retrievers: BaseRetriever

Every retriever in cognee follows the same core pattern. This pattern is defined by a shared interface called BaseRetriever, located in cognee/modules/retrieval/base_retriever.py.

The BaseRetriever is an abstract base class. It doesn't include any shared logic or data. Instead, it defines the two method signatures that every retriever must implement:


  1. get_context(query)

    This method fetches the raw context needed to answer the query. That might be a list of relevant text chunks, a subgraph of related entities, or any other structure the retriever relies on. Each retriever decides how to gather and represent this internally.

  2. get_completion(query, context)

    This method takes that context and turns it into a final answer. Some retrievers call an LLM here to generate a natural language response. Others might return the raw context as-is. It all depends on the search type.

Together, these two steps give every retriever a clean and consistent structure: one method gathers information, the other makes it useful.

This simplicity is what keeps cognee's search API so lightweight. When you call cognee.search(...), the system automatically selects the right retriever and runs the two-step process on your behalf.
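In miniature, that dispatch looks something like the following self-contained sketch. The toy KeywordRetriever is purely illustrative—cognee's real retrievers are async and live in cognee/modules/retrieval—but the two-method contract is the same:

```python
from abc import ABC, abstractmethod

class BaseRetriever(ABC):
    """The two-method contract every retriever implements."""

    @abstractmethod
    def get_context(self, query: str):
        """Fetch the raw context needed to answer the query."""

    @abstractmethod
    def get_completion(self, query: str, context=None):
        """Turn the context into a final answer."""

class KeywordRetriever(BaseRetriever):
    """Toy retriever: 'context' is any stored line containing the query."""

    def __init__(self, lines):
        self.lines = lines

    def get_context(self, query):
        return [line for line in self.lines if query.lower() in line.lower()]

    def get_completion(self, query, context=None):
        if context is None:  # fall back to gathering context internally
            context = self.get_context(query)
        return context or ["(no match)"]

retriever = KeywordRetriever(["Graphs store edges.", "Vectors store embeddings."])
print(retriever.get_completion("graphs"))  # → ['Graphs store edges.']
```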

If no context is provided, most retrievers will call get_context() internally. That's how cognee.search() stays a one-liner on the user's end—even if very different things are happening beneath the surface.

This two-method contract makes the system easy to reason about and easy to extend. Whether you're building a retriever for graph traversal, semantic lookup, code-aware search, or something else, it all follows the same pattern.

3. The Retriever Gallery: Specialized Search Mechanisms

Now that we've covered the overall search architecture and the BaseRetriever structure, let's look at the actual retrievers in action. Each one is designed to handle a specific type of search—some focus on matching short text spans, others retrieve summaries, and some traverse the graph or call LLMs to generate rich, contextual answers.

In this section, we'll walk through each retriever one by one. For each, we'll explain:

  • What it's built for
  • How it gathers context
  • What kind of output it returns
  • When to use it in your application

A. SummariesRetriever: High-Level Document Overviews

Sometimes you don't need all the details—you just want to understand what a document is about. That's exactly what SummariesRetriever does.

It performs a vector search over a special summary index called TextSummary_text, which stores the short summaries generated for each document or chunk during ingestion. When you query, cognee compares your prompt to these summaries and returns the closest matches.

There's no LLM involved—this is just a fast, lightweight lookup that returns summaries as-is. That makes it ideal when you're scanning through content and want a quick, high-level overview of it.
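That lookup is ordinary nearest-neighbor search over embeddings. Here is a toy sketch of the idea, where bag-of-words counts stand in for the dense model embeddings a real vector store would use:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': bag-of-words counts (real systems use model embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

summaries = [
    "Overview of graph databases and their query languages",
    "Introduction to vector embeddings for semantic search",
]

def top_summary(query):
    """Return the stored summary closest to the query."""
    return max(summaries, key=lambda s: cosine(embed(query), embed(s)))

print(top_summary("how do graph databases work"))
```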

Use it when:

You want a fast way to explore relevant documents without reading the entire text.

B. ChunksRetriever: Direct Access to Source Text

If SummariesRetriever gives you the big picture, ChunksRetriever gives you the nitty-gritty.

This retriever searches through the DocumentChunk_text index, which contains short, paragraph-sized fragments—aka chunks—extracted during ingestion. Using vector similarity, it finds the most relevant chunks based on your query and returns them as-is.

No LLMs, no rewriting—just raw, original text pulled straight from your documents. That makes it fast, transparent, and ideal when you want to see exactly what was written.

Use it when:

You need to read the actual content from the source, not a generated summary or rephrasing.

C. CompletionRetriever: RAG-Powered Answers

CompletionRetriever follows the classic Retrieval-Augmented Generation (RAG) approach: it retrieves relevant context, then feeds it into a language model to generate a natural-language answer. This makes it useful for responses that go beyond raw data—something closer to an actual explanation.

This retriever searches the DocumentChunk_text index for the top-matching chunks, combines them into a single context block, and passes that (along with your query) to an LLM. Prompt templates help guide the model's response, keeping it focused, coherent, and readable.

You can control how much context it pulls in using the top_k parameter. Prompt templates are fully customizable, though cognee provides solid defaults out of the box.
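Conceptually, the context-assembly step boils down to joining the top-k chunks under a prompt template. The template below is an illustrative stand-in, not cognee's actual prompt:

```python
def build_rag_prompt(query, chunks, top_k=3):
    """Join the top-k retrieved chunks into one context block for the LLM.

    Illustrative stand-in for cognee's real, customizable prompt templates.
    """
    context = "\n\n".join(chunks[:top_k])
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

chunks = ["Chunk one.", "Chunk two.", "Chunk three.", "Chunk four."]
prompt = build_rag_prompt("What is cognee?", chunks, top_k=2)
print(prompt)
```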

Use it when:

You want a clear, natural-language answer based on trusted source material—perfect for building helpful, context-aware assistants.

D. GraphCompletionRetriever: Reasoning Over the Knowledge Graph

GraphCompletionRetriever is built for questions that benefit from both content and structure. Instead of searching the full graph or running some complex logic, it simply pulls in relationships from the knowledge graph, providing a focused view of how different parts of the data connect.

It begins with a vector search across chunks, summaries, and entity nodes. From the top results, it identifies a relevant set of graph edges—effectively building a small subgraph related to your query.

This subgraph is then converted into plain text: each node gets a short description, and connections between nodes are clearly laid out. That structured context is passed to the LLM, which uses it to generate a coherent, context-aware answer.

Use it when:

You want a natural-language response that draws on both structured relationships and textual content—ideal for questions that depend on understanding how concepts relate.

E. GraphSummaryCompletionRetriever: Concise Answers from Complex Graphs

This retriever builds on the GraphCompletionRetriever but adds an extra layer of refinement: summarization.

When your query touches a wide or deeply connected part of the knowledge graph, the raw context can become noisy or repetitive. Instead of sending all of this to the LLM, cognee first summarizes the graph context—condensing nodes, edges, and supporting content into a shorter, more digestible form. That summary is then passed to the model to generate the final answer.

The result? A concise, natural-language response that still reflects the full depth of the data—without overwhelming the LLM or the user.

Use it when:

Your question spans many connections or concepts, and you want a high-level, structured answer that cuts through the noise.

F. GraphCompletionContextExtensionRetriever: Building a Broader Picture

This retriever starts like GraphCompletionRetriever, but iteratively pushes further. Instead of stopping at the first set of relevant graph connections, it keeps extending its view to build a more complete understanding.

After gathering the initial graph context, it prompts the LLM to suggest where to look next. That suggestion becomes a new search query. This process is repeated a few times, gradually expanding the subgraph and enriching the context with each round.

Once the broader context is assembled, the final question is posed to the LLM, which generates a well-informed answer based on the extended graph.
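The extend-then-answer loop can be sketched as follows. Both helper functions are stubs: in the real retriever, suggest_next_query is an LLM call and fetch_edges is a vector-plus-graph lookup.

```python
def suggest_next_query(context):
    """Stub for the LLM call that proposes where to look next."""
    return "tell me more about " + context[-1].split()[-1]

def fetch_edges(query, graph_edges):
    """Stub retrieval: return edges mentioning the query's last word."""
    key = query.split()[-1]
    return [edge for edge in graph_edges if key in edge.split()]

graph_edges = ["A mentions B", "B contains C", "C relates_to D"]

# Round zero: gather the initial graph context for the user's question.
context = fetch_edges("tell me about B", graph_edges)

# A fixed number of extension rounds, each widening the subgraph.
for _ in range(2):
    follow_up = suggest_next_query(context)
    for edge in fetch_edges(follow_up, graph_edges):
        if edge not in context:
            context.append(edge)

print(context)
```

Each round pulls in edges one hop further from the original hits, which is how indirectly related context ends up in front of the LLM for the final answer.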

Use it when:

You expect the answer to require layered or indirect context. Start with GraphCompletionRetriever, and switch to this when you need deeper exploration and broader coverage.

G. GraphCompletionCotRetriever: Reasoning in Steps

This retriever builds on GraphCompletionRetriever, but introduces a chain-of-thought (CoT) approach to answer generation. Instead of producing a response in a single pass, it reasons iteratively—building context, questioning assumptions, and refining its output as it goes.

Like GraphCompletionContextExtensionRetriever, it expands the graph context over multiple steps, but with a focus on reflective reasoning. After generating an initial answer, the retriever pauses, evaluates whether the response holds up, and prompts the model to suggest a follow-up question. That follow-up triggers a new round of context gathering and response refinement.

This loop continues for several rounds, allowing the system to explore more nuanced connections and clarify its own reasoning before delivering the final answer.

Use it when:

Your question involves multiple layers of logic or requires step-by-step reasoning across different parts of the graph. Perfect for complex queries where a single pass might miss the bigger picture.

H. CypherSearchRetriever: Direct Graph Querying

This retriever gives you full, low-level control over the knowledge graph. Instead of relying on embeddings, vector search, or prompt-based reasoning, it lets you write raw Cypher queries and execute them directly against the database.

You write the query, cognee runs it, and returns the raw results—no LLMs, no vector search, no interpretation layer, and no abstraction in between.
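For example, a query like the following is passed straight through to the database. The labels and relationship types here are illustrative—use whatever your graph schema actually contains:

```cypher
MATCH (e:Entity)-[r]->(c:DocumentChunk)
WHERE toLower(e.name) CONTAINS 'neo4j'
RETURN e.name, type(r), c.text
LIMIT 5
```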

To use this retriever, your graph backend must support Cypher (e.g., Neo4j). It won't work with in-memory graphs like NetworkX.

Use it when:

You know Cypher and want precise, hands-on control over what gets queried and how. Ideal for advanced users who need full transparency and specificity.

I. NaturalLanguageRetriever: Ask in English, Get Cypher Answers

If CypherSearchRetriever is for graph power users, this one is for everyone else. NaturalLanguageRetriever lets you ask questions in plain English. Under the hood, it translates your query into Cypher, runs it against the graph, and returns the results.

It uses an LLM to interpret your question, reference the graph schema, and generate a valid Cypher query. If the first attempt doesn't return results, it automatically retries with a refined query.

Like the Cypher retriever, this one requires a graph backend that supports Cypher (e.g., Neo4j). It's not compatible with in-memory graph stores like NetworkX.

Use it when:

You want to explore your graph and get precise answers—without having to write Cypher by hand.

Well Done—You're Practically a Retriever Now!

You've made it through summaries, chunks, graphs, completions, and even a bit of step-by-step reasoning. At this point, you probably understand the cognee search system better than some of the retrievers do.

As you've likely realized, what makes the cognee machinery tick is its layered, modular approach. Instead of relying on a single type of search, it offers many—each with its own role, but all following the same simple structure. The result of that consistency? A powerful, intuitive, and easy-to-extend system.

From fast lookups to multi-hop graph walks, you can shift search strategies without changing how you ask. And we're not done here—new retrievers are already in testing, with more still sketched out on whiteboards. As the system evolves, so will your options.

To wrap up, we've prepared a special treat (pun intended)—take a quick scroll (get it? stroll 🤭) down below and get to know your inner retriever.

Quiz — Which cognee Retriever Are You?


Q1: When you open the fridge at 3 a.m., you…
Q2: The night before an exam, you…
Q3: At the school karaoke show, you…
Q4: If zombies showed up tomorrow, you…
