Build graph-native RAG with cognee and Amazon Neptune Analytics

This July, at the AWS Summit in New York, AWS highlighted customers' need to move beyond experimentation to production-ready, explainable AI systems. Retrieval-Augmented Generation (RAG) remains central to that journey, but conventional pipelines often lose context across documents and add operational overhead because data must be kept synchronized across separate graph and vector stores.

With Amazon Neptune Analytics, customers can store embeddings directly on graph nodes and query them with openCypher, benefiting from both semantic recall and multi-hop graph traversal in a single managed service. In this post, we show how cognee, an open-source AI memory engine, integrates with Neptune Analytics to deliver graph-native retrieval for RAG without juggling two backends.

Related launches to explore:

• Amazon S3 Vectors (preview) for cost-efficient, native vector storage at scale—ideal for long-term vector retention feeding RAG.

• Amazon SageMaker AI – Nova customization to tailor foundation models across the training lifecycle, then deploy to Amazon Bedrock.

Solution overview

Model your domain once. cognee turns raw data into a semantic graph of Pydantic DataPoint objects, and lets you choose which fields to embed for semantic search. (Partner capability for graph construction; runs anywhere.)
Store embeddings on the graph. Amazon Neptune Analytics persists embeddings on nodes and provides vector algorithms callable from openCypher (for example, top-K by embedding with filters), so you can traverse relationships and retrieve semantically similar content in the same query.
Operate one backend, not two. A single hybrid adapter points both your graph and vector configurations at Neptune Analytics—reducing moving parts and keeping security, scaling, and monitoring under a single AWS control plane (IAM, CloudWatch, etc.).
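
To make the "similarity plus traversal in one query" idea concrete, here is a minimal in-memory sketch in plain Python. It is not Neptune Analytics code: the toy nodes, edges, and helper names (`top_k`, `neighbors`, `retrieve`) are all hypothetical and only illustrate the retrieval pattern of ranking nodes by embedding similarity, filtering by label, and then expanding each hit along its relationships.

```python
import math

# Toy in-memory graph: each node carries its own embedding (hypothetical data).
NODES = {
    "doc1": {"label": "Document", "embedding": [1.0, 0.0]},
    "doc2": {"label": "Document", "embedding": [0.6, 0.8]},
    "ent1": {"label": "Entity", "embedding": [0.9, 0.1]},
}
EDGES = [("doc1", "MENTIONS", "ent1"), ("doc2", "MENTIONS", "ent1")]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, k, label=None):
    """Rank nodes by similarity to `query`, optionally filtered by label."""
    scored = [
        (node_id, cosine(query, props["embedding"]))
        for node_id, props in NODES.items()
        if label is None or props["label"] == label
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


def neighbors(node_id):
    """One-hop outgoing traversal from a node."""
    return [(rel, dst) for src, rel, dst in EDGES if src == node_id]


def retrieve(query, k=1, label="Document"):
    """Similarity search first, then expand each hit along its edges."""
    return [
        {"node": node_id, "score": score, "related": neighbors(node_id)}
        for node_id, score in top_k(query, k, label)
    ]


hits = retrieve([1.0, 0.0])
```

With a single engine, both steps run against the same data, so there is no second store whose contents could drift from the graph.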

What this gets you

End-to-end GenAI readiness: With graph + vectors in one engine, teams can go from ingestion → indexing → RAG queries without extra ETL pipelines or middleware—shortening prototype-to-production timeframes.
Performance & cost efficiency: Removes the need for a second vector store and cross-system synchronization, which can reduce infrastructure cost and query latency.
Better explainability & trust: Every AI-generated answer can be traced back through the graph to its source documents and the intermediate relationships, helping with compliance and auditability.

Why Amazon Neptune Analytics

Neptune Analytics is built for investigative and analytical graph workloads and adds native vector search directly into your openCypher queries. That means you can:

Upsert and query embeddings on nodes,
Run top-K similarity with filters in openCypher, and
Traverse multi-hop relationships for explainable answers—all in one place.

This approach aligns with customers building explainable RAG: semantic recall for breadth, plus graph structure for precision, provenance, and multi-hop reasoning. For model choice and tuning, Amazon SageMaker AI now supports Nova customization across pre-training and post-training (including PEFT and full fine-tuning), with seamless deployment to Amazon Bedrock—so you can pair custom models with graph-native retrieval on AWS.

For durable, cost-optimized vector retention, Amazon S3 Vectors (preview) introduces vector buckets with native APIs for storing and querying large vector datasets, integrated with Amazon Bedrock Knowledge Bases and Amazon OpenSearch Service for tiered vector strategies.

Use Neptune Analytics for low-latency graph + vector retrieval, and S3 Vectors for long-term, low-cost vector storage that feeds RAG and agent memory.

What is cognee

cognee is an open-source AI memory engine that builds a knowledge graph over heterogeneous data (from 30+ file formats) and generates embeddings for the fields you choose—so you can retrieve by both structure (graph traversal) and semantics (similarity). It fits cleanly into AWS RAG stacks:

Data modeling with DataPoints. You define Pydantic models; cognee turns them into nodes (and informs edges). Fields listed in the metadata index_fields are embedded for semantic search.
```
from cognee.infrastructure.engine import DataPoint

class Person(DataPoint):
    name: str
    age: int
    metadata = {"index_fields": ["name"]}
```
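
To illustrate what "fields listed in index_fields are embedded" means in practice, the sketch below uses a standard-library dataclass as a stand-in for the Person DataPoint. It does not use cognee, and `PersonRecord` and `text_to_embed` are hypothetical names. It only shows the selection step: of all the fields on a record, only those named in `index_fields` are handed to an embedding model.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class PersonRecord:
    """Stand-in for the Person DataPoint above (not a cognee class)."""
    name: str
    age: int
    metadata: dict = field(default_factory=lambda: {"index_fields": ["name"]})


def text_to_embed(record):
    """Return {field_name: text} for only the fields listed in index_fields."""
    values = asdict(record)
    return {
        name: str(values[name])
        for name in record.metadata.get("index_fields", [])
        if name in values
    }


selected = text_to_embed(PersonRecord(name="Ada", age=36))
```

Here `age` stays on the node as a plain property for filtering and traversal, while only `name` would be sent for embedding.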
Pipelines & tasks. Extensible, parallel pipelines for ingestion and enrichment:
- add (ingest),
- cognify (chunk → extract → summarize → write),
- codify (code analysis → graph write).
Insert custom Python tasks as needed.
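
The task-chaining idea can be sketched without cognee as a list of async functions, each consuming the previous task's output. The task names below (`chunk`, `tag_lengths`) and the `run_pipeline` helper are hypothetical and far simpler than cognee's real pipeline machinery; the point is only to show where a custom Python task would slot in.

```python
import asyncio


async def chunk(text):
    """Split text into sentence-like chunks."""
    return [part.strip() for part in text.split(".") if part.strip()]


async def tag_lengths(chunks):
    """Hypothetical custom task: annotate each chunk with its word count."""
    return [{"text": c, "words": len(c.split())} for c in chunks]


async def run_pipeline(data, tasks):
    """Feed the output of each task into the next one, in order."""
    for task in tasks:
        data = await task(data)
    return data


result = asyncio.run(
    run_pipeline("Graphs store facts. Vectors store meaning.", [chunk, tag_lengths])
)
```

Inside a notebook, where an event loop is already running, you would `await run_pipeline(...)` directly instead of calling `asyncio.run`.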
Isolation & scoping. Use node sets (e.g., team, tenant, project) to scope search and retrieval to logical groupings.
Multiple retrieval modes. 10+ search types for different needs:
- SUMMARIES — concise topical overviews
- INSIGHTS — graph‑focused relationships
- CHUNKS — precise text snippets
- RAG_COMPLETION — traditional retrieval‑augmented generation
- GRAPH_COMPLETION — graph‑aware generation
- GRAPH_SUMMARY_COMPLETION — generation over graph summaries
- GRAPH_COMPLETION_COT — chain‑of‑thought over multi‑hop traversals
- GRAPH_COMPLETION_CONTEXT_EXTENSION — iterative context expansion
- CODE — code‑artifact retrieval
- CYPHER — raw Cypher queries against graph backends
- NATURAL_LANGUAGE — natural language to Cypher
- FEELING_LUCKY — exploratory retrieval
Domain alignment (optional). Provide RDF/OWL ontologies to map entities and properties onto domain vocabularies; the resulting ontology subgraph is merged into the knowledge graph for consistency.

Where this fits on AWS

Pair cognee for graph construction with Amazon Neptune Analytics for graph + vector retrieval in openCypher, with no separate vector store to operate or synchronize.
Use Amazon SageMaker AI to customize models (e.g., Amazon Nova) and deploy to Amazon Bedrock for generation.

End-to-end example: cognee + Neptune Analytics

Goal: Use Neptune Analytics as a single backend for both the graph and vectors, with cognee handling ingestion and enrichment, and run retrieval via graph traversal + similarity.

Clone the cognee repo and see cognee/notebooks/neptune-analytics-example.ipynb for the full, executable version.

Stack summary

Ingest & enrich with cognee → build graph from heterogeneous data.
Store embeddings on nodes in Amazon Neptune Analytics.
Query in openCypher with semantic similarity + traversal for explainable RAG.
(Optional) Use Amazon Bedrock for generation; use Amazon S3 Vectors for low-cost long-term vector retention feeding RAG.

Prerequisites

An Amazon Neptune Analytics instance (public connectivity or VPC access configured).
IAM permissions on the target graph identifier:
- neptune-graph:ReadDataViaQuery
- neptune-graph:WriteDataViaQuery
- neptune-graph:DeleteDataViaQuery
Vector dimension configured to match your embedding model's output (e.g., 1536 for text-embedding-3-small).
AWS credentials available via environment variables, AWS profile, or explicit keys.
(Optional) Customize the Nova model and deploy to Amazon Bedrock for downstream generation.
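
As a starting point for the IAM prerequisite, the following sketch builds a policy document that grants the three data-access actions listed above on a single graph. The helper name `neptune_graph_policy`, the account ID, and the Region are placeholders, and the resource ARN layout is our assumption; confirm it against the IAM documentation for Neptune Analytics before attaching the policy.

```python
import json


def neptune_graph_policy(region, account_id, graph_id):
    """Build an IAM policy document granting the three data-access actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "neptune-graph:ReadDataViaQuery",
                    "neptune-graph:WriteDataViaQuery",
                    "neptune-graph:DeleteDataViaQuery",
                ],
                # ARN layout is an assumption; confirm it for your account.
                "Resource": f"arn:aws:neptune-graph:{region}:{account_id}:graph/{graph_id}",
            }
        ],
    }


# Placeholder values; substitute your own Region, account, and graph ID.
policy = neptune_graph_policy("us-east-1", "123456789012", "g-your-graph")
policy_json = json.dumps(policy, indent=2)
```

Scoping the resource to one graph identifier, rather than a wildcard, keeps the permissions limited to the graph cognee writes to.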

Example environment variables:

export AWS_ACCESS_KEY_ID=your-access-key
export AWS_SECRET_ACCESS_KEY=your-secret-key
export AWS_SESSION_TOKEN=your-session-token
export AWS_DEFAULT_REGION=your-region

# Neptune Analytics graph identifier
export AWS_NEPTUNE_ANALYTICS_GRAPH_ID=g-your-graph

# OpenAI API key for cognee
export LLM_API_KEY=your-openai-api-key
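
Before running the notebook cells, it can help to fail fast when a variable is missing. The check below is a small sketch; `REQUIRED_VARS` and `missing_vars` are hypothetical names, and the list deliberately leaves out the AWS key variables because credentials may instead come from an AWS profile.

```python
import os

REQUIRED_VARS = [
    "AWS_DEFAULT_REGION",
    "AWS_NEPTUNE_ANALYTICS_GRAPH_ID",
    "LLM_API_KEY",
]


def missing_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Demonstration with an explicit dict instead of the real environment.
example_env = {"AWS_DEFAULT_REGION": "us-east-1", "LLM_API_KEY": "placeholder"}
missing = missing_vars(example_env)
```

Calling `missing_vars()` with no arguments checks the real process environment.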

Environment & directory setup

import os
import pathlib
from dotenv import load_dotenv
from cognee import config

# Load environment variables from .env (optional)
load_dotenv()

current_directory = os.getcwd()

data_directory_path = str((pathlib.Path(current_directory) / ".data_storage").resolve())
config.data_root_directory(data_directory_path)

cognee_directory_path = str((pathlib.Path(current_directory) / ".cognee_system").resolve())
config.system_root_directory(cognee_directory_path)

Configure Neptune Analytics for both graph + vectors

import os

from cognee import config

# Read the target graph identifier from the environment
graph_id = os.getenv("AWS_NEPTUNE_ANALYTICS_GRAPH_ID", "")

config.set_graph_db_config({
    "graph_database_provider": "neptune_analytics",
    "graph_database_url": f"neptune-graph://{graph_id}",
})
config.set_vector_db_config({
    "vector_db_provider": "neptune_analytics",
    "vector_db_url": f"neptune-graph://{graph_id}",
})

Why this matters (ops): One backend, one IAM model, unified monitoring—no cross-system sync.
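
Because both configurations must point at the same graph, one option is to derive them from a single helper so they cannot drift apart. The sketch below reuses the dictionary keys from the configuration above; the helper name `neptune_backend_configs` is hypothetical, and it adds a guard for an empty graph ID, which the `os.getenv(..., "")` default would otherwise let through as `neptune-graph://`.

```python
def neptune_backend_configs(graph_id):
    """Return (graph_config, vector_config) that share one Neptune Analytics URL."""
    if not graph_id:
        raise ValueError("AWS_NEPTUNE_ANALYTICS_GRAPH_ID is empty")
    url = f"neptune-graph://{graph_id}"
    graph_config = {
        "graph_database_provider": "neptune_analytics",
        "graph_database_url": url,
    }
    vector_config = {
        "vector_db_provider": "neptune_analytics",
        "vector_db_url": url,
    }
    return graph_config, vector_config


graph_cfg, vector_cfg = neptune_backend_configs("g-your-graph")
```

The two dictionaries can then be passed to `config.set_graph_db_config` and `config.set_vector_db_config` respectively.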

Optional: clean slate

from cognee import prune
await prune.prune_data()
await prune.prune_system(metadata=True)

Ingest sample data & build the graph

from cognee import add, cognify

sample_text_1 = """Neptune Analytics is a memory-optimized graph database engine for analytics. With Neptune Analytics, you can get insights and find trends by processing large amounts of graph data in seconds. To analyze graph data quickly and easily, Neptune Analytics stores large graph datasets in memory. It supports a library of optimized graph analytic algorithms, low-latency graph queries, and vector search capabilities within graph traversals.
"""

sample_text_2 = """Neptune Analytics is an ideal choice for investigatory, exploratory, or data-science workloads that require fast iteration for data, analytical and algorithmic processing, or vector search on graph data. It complements Amazon Neptune Database, a popular managed graph database. To perform intensive analysis, you can load the data from a Neptune Database graph or snapshot into Neptune Analytics. You can also load graph data that's stored in Amazon S3.
"""

dataset_name = "neptune_descriptions"

await add([sample_text_1, sample_text_2], dataset_name)

await cognify([dataset_name])

Embeddings provider: Configure cognee to use Amazon Bedrock embeddings (e.g., Titan Text Embeddings V2) so vector dimensions align with your Neptune Analytics setting.

Visualize the graph

from cognee import visualize_graph

graph_file_path = str(
    (pathlib.Path(current_directory) / ".artifacts" / "graph_visualization.html").resolve()
)
await visualize_graph(graph_file_path)

Figure: Neptune Analytics demo visualization showing graph nodes and relationships.

Retrieve: graph-aware and semantic

from cognee import search, SearchType

# 1) Graph completion
graph_completion = await search(
    query_text="What is Neptune Analytics?",
    query_type=SearchType.GRAPH_COMPLETION,
)
print("\nGraph completion result:")
print(graph_completion)

# 2) RAG completion
rag_completion = await search(
    query_text="What is Neptune Analytics?",
    query_type=SearchType.RAG_COMPLETION,
)
print("\nRAG completion result:")
print(rag_completion)

# 3) Graph insights (relationships)
insights_results = await search(
    query_text="Neptune Analytics",
    query_type=SearchType.INSIGHTS,
)
print("\nInsights:")
for result in insights_results:
    src_node = result[0].get("name", result[0]["type"])
    tgt_node = result[2].get("name", result[2]["type"])
    relationship = result[1].get("relationship_name", "__relationship__")
    print(f"- {src_node} -[{relationship}]-> {tgt_node}")

# 4) Summaries
summaries = await search(
    query_text="Neptune Analytics",
    query_type=SearchType.SUMMARIES,
)
print("\nSummaries:")
for summary in summaries:
    print(f"- {summary['type']}: {summary['text']}")

# 5) Chunks
chunks = await search(
    query_text="Neptune Analytics",
    query_type=SearchType.CHUNKS,
)
print("\nChunks:")
for chunk in chunks:
    print(f"- {chunk['type']}: {chunk['text']}")
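
The INSIGHTS loop above indexes each result as a (source, edge, target) triplet. One detail worth noting is that `result[0].get("name", result[0]["type"])` evaluates the fallback eagerly, so it raises a KeyError when a node has no `type` key even if `name` is present. The sketch below is a slightly more defensive formatter under the same assumed result shape; `format_triplet` is a hypothetical helper, and the sample data is hand-written rather than real search output.

```python
def format_triplet(triplet):
    """Render one (source, edge, target) result as a readable line."""
    source, edge, target = triplet
    src = source.get("name", source.get("type", "?"))
    tgt = target.get("name", target.get("type", "?"))
    rel = edge.get("relationship_name", "__relationship__")
    return f"{src} -[{rel}]-> {tgt}"


# Hand-written sample in the shape the loop above expects (not real output).
sample = (
    {"name": "Neptune Analytics", "type": "Entity"},
    {"relationship_name": "complements"},
    {"type": "Entity"},
)
line = format_triplet(sample)
```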

openCypher vector search (conceptual): In Neptune Analytics, you can perform top-K by embedding with filters directly in openCypher, then continue a multi-hop traversal in the same query.
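
As a rough illustration of that pattern, the helper below assembles an openCypher query string that runs a vector top-K call, filters the hits by label, and then expands one hop. Treat it as a sketch: the procedure name `neptune.algo.vectors.topKByEmbedding` and its `topK` option reflect our understanding of the Neptune Analytics vector algorithms, the query shape has not been validated against a live graph, and `build_topk_query` is a hypothetical helper. Note also that filtering after the top-K call can return fewer than K rows.

```python
def build_topk_query(embedding, label, k):
    """Build an openCypher query: vector top-K, label filter, one-hop expansion."""
    if not label.isidentifier():
        raise ValueError("label must be a plain identifier")
    if k < 1:
        raise ValueError("k must be at least 1")
    # Inline the vector as a list literal of floats.
    vector = "[" + ", ".join(repr(float(x)) for x in embedding) + "]"
    # Procedure name and options are assumptions; verify against current docs.
    return "\n".join(
        [
            f"CALL neptune.algo.vectors.topKByEmbedding({vector}, {{topK: {int(k)}}})",
            "YIELD node, score",
            "WITH node, score",
            f"WHERE node:{label}",
            "MATCH (node)-[r]->(neighbor)",
            "RETURN node, score, type(r) AS relationship, neighbor",
        ]
    )


query = build_topk_query([0.1, 0.2], "Document", 5)
```

The label is validated before being interpolated because labels cannot be passed as query parameters; anything user-supplied should be checked this way or mapped from an allow-list.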

Tiered vectors (optional): Keep long-lived vectors in Amazon S3 Vectors (preview) and use OpenSearch for real-time KNN on hot subsets.

Configure Amazon Bedrock embeddings in cognee (example)

The snippet below shows a minimal way to align cognee with Amazon Bedrock for embeddings, so your embedding dimension matches the Neptune Analytics vector configuration. Treat this as a template—adjust environment variables and provider names per your environment.

import os
from cognee import config

# --- 1) Bedrock credentials/region via environment ---
# export AWS_ACCESS_KEY_ID=...
# export AWS_SECRET_ACCESS_KEY=...
# export AWS_SESSION_TOKEN=...     # if using temporary creds
# export AWS_DEFAULT_REGION=us-east-1

# Choose an embedding model available in your Region (example: Titan Text Embeddings V2)
BEDROCK_EMBED_MODEL_ID = os.getenv("BEDROCK_EMBED_MODEL_ID", "amazon.titan-embed-text-v2:0")

# --- 2) Configure cognee to use Bedrock for embeddings ---
config.set_embedding_config({
    "provider": "bedrock",                  # cognee provider key
    "model_id": BEDROCK_EMBED_MODEL_ID,     # Bedrock model ID
    "region_name": os.getenv("AWS_DEFAULT_REGION", "us-east-1"),
    # Optional controls (example):
    "batch_size": 32,
    "max_retries": 3
})

# --- 3) Align Neptune vector dimension to the model output ---
# For Titan Text Embeddings V2, verify the output dimension in docs and match Neptune settings.
# In Neptune Analytics, set the graph's vector dimension to your model's output size
# before upserting embeddings.
# (Dimension is configured on Neptune side during graph setup / query functions.)

Notes:

Confirm the output dimension of your chosen Bedrock embedding model (e.g., Amazon Titan Text Embeddings V2) and configure the Neptune Analytics vector dimension accordingly.
If you plan to tier vectors, use Amazon S3 Vectors for durable, low-cost storage integrated with Bedrock Knowledge Bases, and rely on Neptune Analytics for low-latency graph + vector retrieval.
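
A dimension mismatch is easiest to catch before any write reaches the graph. The check below is a minimal sketch with a hypothetical helper name, `check_dimensions`; the expected dimension is whatever you configured on the Neptune Analytics graph, and the value 4 is used here only to keep the example small.

```python
def check_dimensions(vectors, expected_dim):
    """Raise ValueError if any vector's length differs from expected_dim."""
    for index, vector in enumerate(vectors):
        if len(vector) != expected_dim:
            raise ValueError(
                f"vector {index} has {len(vector)} dimensions, expected {expected_dim}"
            )
    return len(vectors)


# Two toy 4-dimensional vectors; real embeddings are much longer.
checked = check_dimensions([[0.0] * 4, [1.0] * 4], expected_dim=4)
```

Running this on the first batch of embeddings from a newly configured provider gives an early, readable error instead of a failed upsert.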

When to use this integration

Choose cognee + Amazon Neptune Analytics when you need:

Explainable RAG. Combine semantic similarity with multi-hop reasoning to return grounded answers and show how facts connect.
A single managed backend for graph and vectors. Reduce moving parts with one IAM model and unified monitoring.
Investigative & analytical workflows. Fast iteration over connected data (exploration, hypothesis testing, relationship discovery).
Performance & cost efficiency. Co-locate embeddings with relationships to lower latency and avoid running a separate vector store.

Additional resources

The adapter implementation can be found by searching for NeptuneAnalyticsAdapter in our repository.
Example notebook: Neptune Analytics end‑to‑end walkthrough