Agents Don't Need Another Protocol. They Need a Good CLI.
Your agent forgets everything between sessions. Every conversation starts from zero. Past outcomes, user preferences, resolved issues: all gone. You're building a system that can reason but can't learn.
The fix isn't a bigger context window or a better retrieval plugin. It's persistent memory. And the simplest way to give an agent memory is a CLI.
We built Cognee as an open-source memory engine for AI agents. Here's why we put a terminal interface in front of it, and why that choice makes your agent faster, cheaper, and better over time.
LLMs already know how to use a CLI
LLMs are trained on billions of lines of terminal interactions. Commands, flags, outputs, man pages: these patterns are deep in the weights. When an agent sees cognee-cli recall "deployment history", it doesn't need a schema to understand what that does.
Protocol-based tools work differently:
GitHub's MCP server loads 43 tool definitions (roughly 44,000 tokens) before the agent asks a single question. A CLI loads zero. The agent pays only for what it reads.
That difference is measurable. ScaleKit's benchmarks (75 runs, March 2026) found that 800 tokens of CLI tips reduced tool calls and latency by a third each, the single biggest efficiency gain in their study. Your agent keeps more of its context window for actual reasoning instead of burning it on tool definitions.
Four commands give your agent persistent memory
An agent that can run these four commands has cross-session, graph-structured memory. No SDK integration. No server to run. No schema to inject.
We named them remember, recall, improve, forget instead of database verbs like add, search, enrich, delete, because an LLM reading a system prompt that says "use cognee-cli to remember facts and recall context" parses that instantly. The entire interface fits in roughly 37 tokens of system prompt.
This is the entire integration. No SDK. No config files. No server.
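If you do want the agent runtime to call the CLI programmatically, the glue is a few lines. This is an illustrative sketch, not an official SDK: it assumes cognee-cli is on PATH, and the `run` parameter is injectable only so the wrapper can be exercised without the CLI installed.

```python
import subprocess

def cognee(*args, run=subprocess.run):
    """Shell out to cognee-cli and return its stdout.

    Sketch only: assumes cognee-cli is installed and on PATH.
    The injectable `run` lets tests stub the subprocess call.
    """
    result = run(["cognee-cli", *args], capture_output=True, text=True)
    return result.stdout

# Hypothetical usage:
# cognee("remember", "Acme prefers Friday deploys")
# cognee("recall", "deployment history")
```

The agent never imports a library; it just runs commands, which is the point of the CLI-first design.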
Here's what each command does for you.
remember stores knowledge. It runs the full pipeline (entity extraction, relationship detection, graph construction) in a single call. No separate "add then process" step. One command, and the data is in your knowledge graph. (The full pipeline is covered in our architecture post.)
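The single-call pipeline can be pictured as three toy stages. This is an illustrative sketch of the shape of the pipeline, not Cognee's actual extraction code (the toy "extractor" just treats capitalized words as entities):

```python
import re
from itertools import combinations

def extract_entities(text):
    # Toy extractor: capitalized words stand in for real entity extraction.
    return sorted(set(re.findall(r"\b[A-Z][a-z]+\b", text)))

def detect_relations(entities):
    # Toy detector: relate every co-occurring entity pair.
    return list(combinations(entities, 2))

def build_graph(graph, relations):
    # Store relations as an undirected adjacency map.
    for a, b in relations:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def remember(graph, text):
    # One call runs all three stages, mirroring the single-command design.
    return build_graph(graph, detect_relations(extract_entities(text)))

graph = remember({}, "Alice deployed Payments to Staging on Friday")
```

The design point is that the caller sees one verb; the staging is an internal concern.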
recall retrieves context. If a session ID is present, it checks session cache first: a fast path with no graph traversal. No match? It falls through to the full knowledge graph with semantic search. Your agent gets the best available context without deciding where to look.
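The fallback logic looks roughly like this. Everything here is a stand-in (a dict for the session cache, a function for graph search); it illustrates the control flow, not Cognee's internals:

```python
def recall(query, session_id=None, session_cache=None, graph_search=None):
    """Session cache first, full graph search as fallback (sketch only)."""
    if session_id and session_cache:
        hit = session_cache.get((session_id, query))
        if hit is not None:
            return hit          # fast path: cache hit, no graph traversal
    return graph_search(query)  # slow path: semantic search over the graph

# Stand-in components for illustration:
cache = {("s1", "deployment history"): "cached context"}
search = lambda q: f"graph result for {q}"
```

The caller issues one command either way; routing is the memory engine's job.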
forget deletes what shouldn't persist. Memory without a deletion mechanism is a liability.
improve is the reason we built a knowledge graph instead of a document store. After your agent acts on recalled context, it can record whether that context was useful (cognee-cli feedback add <session> <id> --score 5). When improve runs, it adjusts weights across the graph: nodes behind good answers get reinforced, nodes behind poor answers get dampened.
The result: your agent's memory gets better over time without you editing anything. Day 100 is measurably better than day 1. Same agent, same prompt, same model β what changes is the quality of what it recalls.
Note: Most agent memory is static: store chunks, retrieve chunks, get the same results regardless of whether they helped last time. improve closes that loop.
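The reinforce/dampen idea can be sketched in a few lines. The update rule, the rate, and the score scale here are all assumptions for illustration, not Cognee's actual algorithm:

```python
def improve(weights, feedback, rate=0.1):
    """Nudge node weights toward feedback scores (sketch of the idea only).

    feedback maps node -> score in [1, 5]; scores above 3 reinforce
    the node, scores below 3 dampen it.
    """
    for node, score in feedback.items():
        weights[node] = weights.get(node, 1.0) + rate * (score - 3)
    return weights

def recall_ranked(weights, candidates):
    # Higher-weighted nodes surface first on later recalls,
    # which is how day 100 outperforms day 1 with the same prompt.
    return sorted(candidates, key=lambda n: weights.get(n, 1.0), reverse=True)
```

After a few feedback cycles, nodes that kept producing good answers dominate the ranking without any change to the agent or its prompt.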
Try it
Two commands from zero to a working knowledge graph. The architecture underneath β graph store, vector index, 14 search modes, multi-tenant isolation β is covered in our architecture post. The self-improving feedback loop builds on what Veljko described in Building Self-Improving Skills.
The code is at github.com/topoteretes/cognee.
Join the Discord community.