Mar 26, 2026
7 minute read

Expanding Custom Graph Models for Reliable Agent Memory & Retrieval

Veljko Kovac, Head of FDE

Using LLMs for knowledge extraction has improved significantly over the last year. Tasks that used to require hand-written parsers or large annotation pipelines can now often be done with a single prompt and a good model.

The challenge, however, is that generic extraction is rarely stable enough for production. This is a big part of why we keep reading that AI products are still failing to make it past the proof-of-concept stage.

A model may extract one structure today, a slightly different one next week, and yet another after your source data changes. Small shifts in schema, formatting, terminology, or data quality can lead to very different graph outputs.

These inconsistencies often go unnoticed when building the initial solution, where examples are clean and the domain is still small. In production they quickly degrade graph quality, search precision, and downstream agent performance.

That is why we strongly recommend Custom Graph Models in cognee: a practical way to tell the system exactly what kinds of entities, properties, and relationships matter in your domain.

Custom Graph Models are the mechanism we use to add that structure. Instead of asking the LLM to extract whatever graph it finds plausible, you define the entities and relationships you actually want it to produce.

In practice, a Custom Graph Model acts like a domain-specific schema for graph extraction. It tells cognee what kinds of nodes should exist, how they relate to each other, and which properties should matter for retrieval and search. This makes extraction more predictable, keeps graph outputs consistent across documents, and gives downstream pipelines a graph they can actually depend on.
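To make this concrete, here is a minimal, dependency-free sketch of the idea. It is illustrative only: in cognee you define the schema with Pydantic-based models rather than a raw whitelist, and the type and relation names below are hypothetical.

```python
# Illustrative sketch: a custom graph model as an explicit whitelist of
# node types and relationship triplets, so extraction output can be
# validated against the schema instead of drifting freely.
ALLOWED_NODES = {"Person", "Place", "Work"}
ALLOWED_EDGES = {
    ("Person", "born_in", "Place"),
    ("Person", "created", "Work"),
}

def validate_triplet(subj_type: str, relation: str, obj_type: str) -> bool:
    """Accept only triplets the schema explicitly allows."""
    return (subj_type, relation, obj_type) in ALLOWED_EDGES

ok = validate_triplet("Person", "born_in", "Place")      # allowed by schema
bad = validate_triplet("Person", "works_at", "Company")  # rejected: not in schema
```

Everything the model extracts is checked against this contract, which is what makes outputs consistent across documents.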

Why This Matters for Teams Building Agents

Agents succeed or fail based on the quality of the structured knowledge they can reason over. Unlike basic RAG systems that rely on vector similarity alone, production agents need a true agentic memory layer that supports reliable long-term reasoning and self-improvement. Such a layer lets them:

  • Perform confident multi-hop reasoning across related concepts
  • Maintain consistent long-term context across sessions
  • Make decisions based on stable entity and relationship patterns
  • Avoid hallucinations by grounding every step in memory

Generic extraction undermines all of these capabilities. Inconsistent node labels, drifting relationship types, or unexpected property schemas turn your knowledge graph into an unreliable moving target. Over time, the agent's "memory" becomes noisy and unpredictable, breaking planning logic and eroding user trust.

Custom Graph Models solve this directly. By giving you explicit control over the memory schema, they create a stable, domain-aware memory layer that agents can trust at scale. The result is higher reasoning accuracy, fewer hallucinations, and cleaner traversals.

Not Always Easy to Define the Full Schema from Day 0

From many conversations with our users, we learned that defining a schema from day 0 is far from trivial. Usually the reason is either an incomplete understanding of the data or a communication gap between engineers and domain experts. That is exactly the gap we set out to address with our latest feature, Cascade, which progressively does schema discovery for you.

Cascade makes this easier. Instead of requiring a complete graph model upfront, it lets you start with just a few anchor points: key entities or relationships you already know matter, or some that you have inferred with an LLM. From there, it expands and refines the structure in a guided, data-driven way. You provide a small reference dataset, and Cascade uses it to expand the schema of your custom graph model.
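The intuition behind this kind of progressive expansion can be sketched in a few lines. This is not Cascade's actual implementation (which drives discovery with LLM passes over the data); it is a simplified stand-in where candidate entity types proposed per dev document are promoted into the schema once they recur often enough.

```python
from collections import Counter

def expand_schema(schema, candidate_types, min_support=2):
    """Promote candidate entity types that recur across a dev set.

    schema: set of entity type names already in the graph model.
    candidate_types: one list of proposed types per dev document
                     (in practice these would come from an LLM pass).
    min_support: how many documents must surface a type before it
                 is added to the schema.
    """
    counts = Counter(t for doc in candidate_types for t in set(doc))
    discovered = {t for t, n in counts.items()
                  if n >= min_support and t not in schema}
    return schema | discovered

base = {"Person", "Place", "Work"}
dev_set_candidates = [
    ["Person", "Award", "Organization"],
    ["Organization", "Event"],
    ["Award", "Organization", "Event"],
]
expanded = expand_schema(base, dev_set_candidates)
# "Award", "Organization", and "Event" recur, so they join the schema.
```

The key design point is that expansion is anchored: the base schema is never discarded, only grown where the data shows consistent evidence.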

Check the documentation for Custom Graph Models.

Example

To make what we talked about above more concrete, we built a small evaluation around a sampled subset of 2WikiMultihopQA. This was not meant to be a full benchmark — it was a controlled example designed to visualize one specific question:

What happens when the initial custom graph model is too narrow, and cognee's expanding graph functionality is used to grow it from a small development set?

That makes it a good tutorial setup, because it isolates the exact benefit we want to show: a minimal custom model gives you structure but still misses some relationships and entity types. You can then use cognee to discover those missing parts from real data and obtain an expanded graph model that improves retrieval and reasoning on downstream questions.

Step 1: Build a minimal custom graph model

We started with a deliberately small schema containing only Person, Place, and Work. This gave us a stable first-pass graph, but intentionally left out other entities that might matter later.

Step 2: Use a small dev set to let cognee's iterative expanding schema functionality discover what is missing

Instead of redesigning the schema manually, we gave our custom graph expanding functionality a small development set and let it discover recurring nodes, relationship types, and triplets.

The extraction flow is multi-stage:

  1. Extract candidate nodes
  2. Extract candidate relationship names
  3. Extract edge triplets
  4. Integrate them into the graph
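The four stages above can be sketched as a simple pipeline. The sketch is illustrative only: in the real flow each stage is an LLM call over the dev set, while here the stages return fixed stand-in output so the data flow is visible.

```python
def extract_nodes(text):
    # Stage 1: candidate nodes (in reality proposed by an LLM).
    return [("Thorvald Stoltenberg", "Person"), ("United Nations", "Organization")]

def extract_relation_names(text):
    # Stage 2: candidate relationship names.
    return ["works_at"]

def extract_triplets(nodes, relations, text):
    # Stage 3: combine candidate nodes and relation names into edge triplets.
    return [("Thorvald Stoltenberg", "works_at", "United Nations")]

def integrate(graph, nodes, triplets):
    # Stage 4: merge the candidates into the running graph.
    graph["nodes"].update(dict(nodes))
    graph["edges"].update(triplets)
    return graph

graph = {"nodes": {}, "edges": set()}
text = "Thorvald Stoltenberg worked at the United Nations."
nodes = extract_nodes(text)
relations = extract_relation_names(text)
triplets = extract_triplets(nodes, relations, text)
graph = integrate(graph, nodes, triplets)
```

Running stages in this order matters: triplets are only formed from nodes and relation names that earlier stages have already vetted.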

Based on those findings, we expanded the model with new entity types and new edges. The important shift is that expanding graph functionality did not just add more fields — it surfaced entirely new entity classes that the original schema had no place for. Award, Organization, and Event became first-class nodes with their own vector indexes, making them retrievable as real graph objects instead of being buried inside free text.

Step 3: Show Retrieval Improvement

We wanted to compare node and edge counts, as well as retrieval performance, with and without the expanded custom graph model. The questions we picked specifically targeted entities that the user had not written into the first schema.

The base custom model found 372 nodes and 327 edges. After applying cognee's cascade expansion, 27 Organization nodes, 7 Award nodes, and 6 Event nodes were added.

Something to emphasize here: the value of the expansion was not just "more graph." It was the addition of the right graph objects — entity types that the original schema could not represent, but that downstream questions depended on.

We also compared performance against traditional RAG and the cognee default pipeline without any custom graph model:

| Approach | F1 | LLM Judge |
| --- | --- | --- |
| RAG (chunks only) | 0.27 | 0.20 |
| cognee default | 0.35 | 0.40 |
| Custom graph model | 0.37 | 0.40 |
| Expanded custom graph model | 0.54 | 0.60 |
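We assume F1 here is the standard token-overlap F1 commonly used in QA evaluations (2WikiMultihopQA's usual metric); a minimal sketch of how such a score is computed per question:

```python
from collections import Counter

def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between a predicted and a gold answer."""
    pred, ref = prediction.lower().split(), gold.lower().split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# A near-miss answer still earns partial credit:
score = token_f1("the United Nations", "United Nations")  # 0.8
```

Dataset-level F1, as in the table above, is the average of these per-question scores.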

Performance increased as the custom graph model was expanded. One example that makes the benefit particularly clear is the question "Where does Karin Stoltenberg's husband work at?", where the correct answer was "United Nations." The baseline custom model failed — just like RAG and the default cognee pipeline — because none of them represented Organization strongly enough as a dedicated graph object. All three returned "Foreign Minister", which is a related title but not the answer. The expanded custom graph model was the only one that got this question right.
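The multi-hop pattern behind that question can be sketched as a two-hop traversal over typed edges (toy data, not the actual graph). The second hop only exists if Organization is a first-class node type, which is exactly what the expansion added.

```python
# Toy edge store keyed by (subject, relation). Answering the question
# requires chaining "spouse" and then "works_at".
edges = {
    ("Karin Stoltenberg", "spouse"): "Thorvald Stoltenberg",
    ("Thorvald Stoltenberg", "works_at"): "United Nations",
}

def two_hop(start, rel1, rel2):
    """Follow rel1 from start, then rel2 from the intermediate node."""
    mid = edges.get((start, rel1))
    return edges.get((mid, rel2)) if mid else None

answer = two_hop("Karin Stoltenberg", "spouse", "works_at")
# With no Organization nodes, the second lookup has nothing to land on
# and the traversal dead-ends at the intermediate Person node.
```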

The full code of this example can be found here.

Cognee is the fastest way to start building reliable AI agent memory.
