
The Art of Intelligent Retrieval: Unlocking the Power of Search

Ever asked your preferred AI engine a brilliant question, only to get a well-phrased yet pretty useless answer right back? Congrats, you are one of the many, many people dealing with the problem cognee was designed to solve.

In a world drowning in data, we're building the lifeline: smarter retrieval. Okay, okay… It's clearly not quite that dramatic, but—in knowledge-graph-based AI systems, underwhelming retrievals are a pivotal problem we're excited to be taking head-on.

In this post, we will show how we are tackling it from different angles using different retrievers. While we've explored vector databases and how they enable semantic search, and we've looked at how GraphRAG supercharges search with knowledge graphs, we haven't yet pulled back the curtain on the actual retrievers.

That's what we'll do right here, right now. Here's what's ahead:

  1. Preliminaries — A bit of a catch-up so we're all on the same page
  2. A Layered Approach to Search — How cognee's retrieval system is structured
  3. The Retriever Gallery: Specialized Search Mechanisms — A walkthrough of each retriever and what it's for
  4. What Kind of Retriever Are You? — A closing reflection... with a personality-identifying quiz.

Yes, you read that right. For those who stick around until the end, we've included a fun quiz to help you discover which Cognee retriever fits your style. (Are you a structured GraphCompletionRetriever or a free-flowing NaturalLanguageRetriever? You'll soon find out!)

Preliminaries: Cognee's Knowledge Architecture

Before we dive into retrievers, let's quickly recap the core components that power cognee's retrieval system. If you're already familiar with these concepts, feel free to skip ahead.

Tools & Platforms

  • cognee: That's us. Hi there! We've built cognee—an open-source AI memory framework designed to help applications understand, structure, and retrieve knowledge more effectively. cognee ingests raw data and transforms it into a dynamic, searchable knowledge graph, combining the strengths of vector and graph search to enable flexible, context-aware retrieval.
  • Graph Store: The database system (Neo4j, Memgraph, or NetworkX) that maintains cognee's knowledge graph structure, storing nodes, edges, and their properties for relationship-based queries.
  • Vector Store: The database system that indexes and stores vector embeddings for text elements (chunks, summaries, entities) and facilitates powerful search based on semantic similarity.

Core Concepts

  • Knowledge Graph: A structured representation of information as interconnected entities and relationships. In cognee, KGs form the backbone of contextual understanding.
  • Nodes: The points in cognee's knowledge graph that represent entities, documents, chunks, or other discrete data points. Each node contains properties that describe it.
  • Edges: The connections between nodes that represent relationships like "mentions," "is_related_to," or "contains." Edges define how data entities relate to each other.

cognee's Building Blocks

  • Documents: Raw text files, PDFs, web pages, and other content sources that cognee ingests, processes, and exposes for efficient retrieval.
  • Chunks: Self-contained, typically paragraph-sized fragments of text extracted from documents during ingestion. Documents are processed chunk by chunk, and the extracted data is embedded and stored in the graph and vector databases.
  • Entities: Named objects, concepts, people, places, or things identified within documents/chunks. These become nodes in cognee's knowledge graphs and can be connected to other entities or document chunks.
  • Summaries: Strategically chosen parts of the graph are condensed into summaries, which then become new nodes in the graph.

If you've read the preliminaries, you're all set. If you skipped ahead—no worries. This is where we start to see how it all works in practice.

Once your data is ingested using cognee.cognify(), it becomes searchable. From there, cognee.search() is your main interface. You can ask questions and get back relevant answers—whether you need summaries, direct excerpts, graph traversals, or natural language responses. If you haven't already, check out the quickstart example to see the full pipeline in action.

The cognee.search() function supports multiple search types, which define how your question gets answered. You don't need to worry about the mechanics—just specify a SearchType, and cognee handles the rest.

Under the hood, each search type is powered by a different retriever—a component that does the actual work of fetching, scoring, and formatting the result. Despite their differences, all retrievers follow the same structure, built around a shared BaseRetriever class.

This base class defines a simple two-step process:

  1. Gather context
  2. Return or generate an answer

This structure keeps the system modular, transparent, and easy to extend. It also makes debugging and custom development far more approachable.

Next, we'll break down how the search() function works, take a closer look at BaseRetriever, and then get to know the individual retrievers and see how each one solves a different type of problem.

1. The Search API: Your One-Line Lookup

cognee.search() is dead simple to use: just pass in a question, pick a SearchType, and get back relevant, structured results—no extra setup required.

Here's a basic example:
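A sketch of what that call looks like—treat it as illustrative, since the exact import path for SearchType and the available search types can vary between cognee versions, and running it requires a configured cognee installation with an LLM key:

```python
import asyncio
import cognee
from cognee import SearchType  # import path may differ across versions

async def main():
    # Ingest and process some text first (see the quickstart for details).
    await cognee.add("cognee turns raw data into a searchable knowledge graph.")
    await cognee.cognify()

    # Ask a question; the SearchType decides which retriever answers it.
    results = await cognee.search(
        query_text="What does cognee do with raw data?",
        query_type=SearchType.GRAPH_COMPLETION,
    )
    print(results)

asyncio.run(main())
```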

The query_text is your question, and the SearchType tells cognee how to approach it. You can optionally specify a user or a list of datasets—but if you don't, cognee will fall back on sensible defaults.

Behind the scenes, search() handles quite a bit:

  • Prepares and validates your inputs
  • Selects the appropriate retriever for the given search type
  • Runs the retriever to gather context and generate an answer
  • Filters results based on user permissions

You don't need to worry about how graph queries or vector lookups are executed—cognee keeps the API simple while quietly handling all the complexity. In the next section, we'll take a look at how that complexity is structured.

2. Shared Structure for All Retrievers: BaseRetriever

Every retriever in cognee follows the same core pattern. This pattern is defined by a shared interface called BaseRetriever, located in cognee/modules/retrieval/base_retriever.py.

The BaseRetriever is an abstract base class. It doesn't include any shared logic or data. Instead, it defines the two method signatures that every retriever must implement:


  1. get_context(query)

    This method fetches the raw context needed to answer the query. That might be a list of relevant text chunks, a subgraph of related entities, or any other structure the retriever relies on. Each retriever decides how to gather and represent this internally.

  2. get_completion(query, context)

    This method takes that context and turns it into a final answer. Some retrievers call an LLM here to generate a natural language response. Others might return the raw context as-is. It all depends on the search type.

Together, these two steps give every retriever a clean and consistent structure: one method gathers information, the other makes it useful.

This simplicity is what keeps cognee's search API so lightweight. When you call cognee.search(...), the system automatically selects the right retriever and runs the two-step process on your behalf.
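In miniature, that dispatch looks something like the following self-contained sketch. The toy KeywordRetriever is purely illustrative—cognee's real retrievers are async and live in cognee/modules/retrieval—but the two-method contract is the same:

```python
from abc import ABC, abstractmethod

class BaseRetriever(ABC):
    """The two-method contract every retriever implements."""

    @abstractmethod
    def get_context(self, query: str):
        """Fetch the raw context needed to answer the query."""

    @abstractmethod
    def get_completion(self, query: str, context=None):
        """Turn the context into a final answer."""

class KeywordRetriever(BaseRetriever):
    """Toy retriever: 'context' is any stored line containing the query."""

    def __init__(self, lines):
        self.lines = lines

    def get_context(self, query):
        return [line for line in self.lines if query.lower() in line.lower()]

    def get_completion(self, query, context=None):
        if context is None:  # fall back to gathering context internally
            context = self.get_context(query)
        return context or ["(no match)"]

retriever = KeywordRetriever(["Graphs store edges.", "Vectors store embeddings."])
print(retriever.get_completion("graphs"))  # → ['Graphs store edges.']
```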

If no context is provided, most retrievers will call get_context() internally. That's how cognee.search() stays a one-liner on the user's end—even if very different things are happening beneath the surface.

This two-method contract makes the system easy to reason about and easy to extend. Whether you're building a retriever for graph traversal, semantic lookup, code-aware search, or something else, it all follows the same pattern.

3. The Retriever Gallery: Specialized Search Mechanisms

Now that we've covered the overall search architecture and the BaseRetriever structure, let's look at the actual retrievers in action. Each one is designed to handle a specific type of search—some focus on matching short text spans, others retrieve summaries, and some traverse the graph or call LLMs to generate rich, contextual answers.

In this section, we'll walk through each retriever one by one. For each, we'll explain:

  • What it's built for
  • How it gathers context
  • What kind of output it returns
  • When to use it in your application

A. SummariesRetriever: High-Level Document Overviews

Sometimes you don't need all the details—you just want to understand what a document is about. That's exactly what SummariesRetriever does.

It performs a vector search over a special summary index called TextSummary_text, which stores the short summaries generated for each document or chunk during ingestion. When you query, cognee compares your prompt to these summaries and returns the closest matches.

There's no LLM involved—this is just a fast, lightweight lookup that returns summaries as-is. That makes it ideal when you're scanning through content and want a quick, high-level overview of it.
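That lookup is ordinary nearest-neighbor search over embeddings. Here is a toy sketch of the idea, where bag-of-words counts stand in for the dense model embeddings a real vector store would use:

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': bag-of-words counts (real systems use model embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

summaries = [
    "Overview of graph databases and their query languages",
    "Introduction to vector embeddings for semantic search",
]

def top_summary(query):
    """Return the stored summary closest to the query."""
    return max(summaries, key=lambda s: cosine(embed(query), embed(s)))

print(top_summary("how do graph databases work"))
```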

Use it when:

You want a fast way to explore relevant documents without reading the entire text.

B. ChunksRetriever: Direct Access to Source Text

If SummariesRetriever gives you the big picture, ChunksRetriever gives you the nitty-gritty.

This retriever searches through the DocumentChunk_text index, which contains short, paragraph-sized fragments—aka chunks—extracted during ingestion. Using vector similarity, it finds the most relevant chunks based on your query and returns them as-is.

No LLMs, no rewriting—just raw, original text pulled straight from your documents. That makes it fast, transparent, and ideal when you want to see exactly what was written.

Use it when:

You need to read the actual content from the source, not a generated summary or rephrasing.

C. CompletionRetriever: RAG-Powered Answers

CompletionRetriever follows the classic Retrieval-Augmented Generation (RAG) approach: it retrieves relevant context, then feeds it into a language model to generate a natural-language answer. This makes it useful for responses that go beyond raw data—something closer to an actual explanation.

This retriever searches the DocumentChunk_text index for the top-matching chunks, combines them into a single context block, and passes that (along with your query) to an LLM. Prompt templates help guide the model's response, keeping it focused, coherent, and readable.

You can control how much context it pulls in using the top_k parameter. Prompt templates are fully customizable, though cognee provides solid defaults out of the box.
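Conceptually, the context-assembly step boils down to joining the top-k chunks under a prompt template. The template below is an illustrative stand-in, not cognee's actual prompt:

```python
def build_rag_prompt(query, chunks, top_k=3):
    """Join the top-k retrieved chunks into one context block for the LLM.

    Illustrative stand-in for cognee's real, customizable prompt templates.
    """
    context = "\n\n".join(chunks[:top_k])
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

chunks = ["Chunk one.", "Chunk two.", "Chunk three.", "Chunk four."]
prompt = build_rag_prompt("What is cognee?", chunks, top_k=2)
print(prompt)
```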

Use it when:

You want a clear, natural-language answer based on trusted source material—perfect for building helpful, context-aware assistants.

D. GraphCompletionRetriever: Reasoning Over the Knowledge Graph

GraphCompletionRetriever is built for questions that benefit from both content and structure. Instead of searching the full graph or running some complex logic, it simply pulls in relationships from the knowledge graph, providing a focused view of how different parts of the data connect.

It begins with a vector search across chunks, summaries, and entity nodes. From the top results, it identifies a relevant set of graph edges—effectively building a small subgraph related to your query.

This subgraph is then converted into plain text: each node gets a short description, and connections between nodes are clearly laid out. That structured context is passed to the LLM, which uses it to generate a coherent, context-aware answer.

Use it when:

You want a natural-language response that draws on both structured relationships and textual content—ideal for questions that depend on understanding how concepts relate.

E. GraphSummaryCompletionRetriever: Concise Answers from Complex Graphs

This retriever builds on the GraphCompletionRetriever but adds an extra layer of refinement: summarization.

When your query touches a wide or deeply connected part of the knowledge graph, the raw context can become noisy or repetitive. Instead of sending all of this to the LLM, cognee first summarizes the graph context—condensing nodes, edges, and supporting content into a shorter, more digestible form. That summary is then passed to the model to generate the final answer.

The result? A concise, natural-language response that still reflects the full depth of the data—without overwhelming the LLM or the user.

Use it when:

Your question spans many connections or concepts, and you want a high-level, structured answer that cuts through the noise.

F. GraphCompletionContextExtensionRetriever: Building a Broader Picture

This retriever starts like GraphCompletionRetriever, but iteratively pushes further. Instead of stopping at the first set of relevant graph connections, it keeps extending its view to build a more complete understanding.

After gathering the initial graph context, it prompts the LLM to suggest where to look next. That suggestion becomes a new search query. This process is repeated a few times, gradually expanding the subgraph and enriching the context with each round.

Once the broader context is assembled, the final question is posed to the LLM, which generates a well-informed answer based on the extended graph.
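The extend-then-answer loop can be sketched as follows. Both helper functions are stubs: in the real retriever, suggest_next_query is an LLM call and fetch_edges is a vector-plus-graph lookup.

```python
def suggest_next_query(context):
    """Stub for the LLM call that proposes where to look next."""
    return "tell me more about " + context[-1].split()[-1]

def fetch_edges(query, graph_edges):
    """Stub retrieval: return edges mentioning the query's last word."""
    key = query.split()[-1]
    return [edge for edge in graph_edges if key in edge.split()]

graph_edges = ["A mentions B", "B contains C", "C relates_to D"]

# Round zero: gather the initial graph context for the user's question.
context = fetch_edges("tell me about B", graph_edges)

# A fixed number of extension rounds, each widening the subgraph.
for _ in range(2):
    follow_up = suggest_next_query(context)
    for edge in fetch_edges(follow_up, graph_edges):
        if edge not in context:
            context.append(edge)

print(context)
```

Each round pulls in edges one hop further from the original hits, which is how indirectly related context ends up in front of the LLM for the final answer.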

Use it when:

You expect the answer to require layered or indirect context. Start with GraphCompletionRetriever, and switch to this when you need deeper exploration and broader coverage.

G. GraphCompletionCotRetriever: Reasoning in Steps

This retriever builds on GraphCompletionRetriever, but introduces a chain-of-thought (CoT) approach to answer generation. Instead of producing a response in a single pass, it reasons iteratively—building context, questioning assumptions, and refining its output as it goes.

Like GraphCompletionContextExtensionRetriever, it expands the graph context over multiple steps, but with a focus on reflective reasoning. After generating an initial answer, the retriever pauses, evaluates whether the response holds up, and prompts the model to suggest a follow-up question. That follow-up triggers a new round of context gathering and response refinement.

This loop continues for several rounds, allowing the system to explore more nuanced connections and clarify its own reasoning before delivering the final answer.

Use it when:

Your question involves multiple layers of logic or requires step-by-step reasoning across different parts of the graph. Perfect for complex queries where a single pass might miss the bigger picture.

H. CypherSearchRetriever: Direct Graph Querying

This retriever gives you full, low-level control over the knowledge graph. Instead of relying on embeddings, vector search, or prompt-based reasoning, it lets you write raw Cypher queries and execute them directly against the database.

You write the query, cognee runs it, and returns the raw results—no LLMs, no vector search, no interpretation layer, and no abstraction in between.
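For example, a query like the following is passed straight through to the database. The labels and relationship types here are illustrative—use whatever your graph schema actually contains:

```cypher
MATCH (e:Entity)-[r]->(c:DocumentChunk)
WHERE toLower(e.name) CONTAINS 'neo4j'
RETURN e.name, type(r), c.text
LIMIT 5
```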

To use this retriever, your graph backend must support Cypher (e.g., Neo4j). It won't work with in-memory graphs like NetworkX.

Use it when:

You know Cypher and want precise, hands-on control over what gets queried and how. Ideal for advanced users who need full transparency and specificity.

I. NaturalLanguageRetriever: Ask in English, Get Cypher Answers

If CypherSearchRetriever is for graph power users, this one is for everyone else. NaturalLanguageRetriever lets you ask questions in plain English. Under the hood, it translates your query into Cypher, runs it against the graph, and returns the results.

It uses an LLM to interpret your question, reference the graph schema, and generate a valid Cypher query. If the first attempt doesn't return results, it automatically retries with a refined query.

Like the Cypher retriever, this one requires a graph backend that supports Cypher (e.g., Neo4j). It's not compatible with in-memory graph stores like NetworkX.

Use it when:

You want to explore your graph and get precise answers—without having to write Cypher by hand.

Well Done—You're Practically a Retriever Now!

You've made it through summaries, chunks, graphs, completions, and even a bit of step-by-step reasoning. At this point, you probably understand the cognee search system better than some of the retrievers do.

As you've likely realized, what makes the cognee machinery tick is its layered, modular approach. Instead of relying on a single type of search, it offers many—each with its own role, but all following the same simple structure. The result of that consistency? A powerful, intuitive, and easy-to-extend system.

From fast lookups to multi-hop graph walks, you can shift search strategies without changing how you ask. And we're not done here—new retrievers are already in testing, with more still sketched out on whiteboards. As the system evolves, so will your options.

To wrap up, we've prepared a special treat (pun intended)—take a quick scroll (get it? stroll 🤭) down below and get to know your inner retriever.

Quiz — Which cognee Retriever Are You?


Q1: When you open the fridge at 3 a.m., you…
Q2: The night before an exam, you…
Q3: At the school karaoke show, you…
Q4: If zombies showed up tomorrow, you…
