Agent Memory Architectures: Vector DBs vs Knowledge Graphs vs Long-Term Store

Reviewed: June 4, 2026

Memory is what separates a chatbot from an agent. Without memory, every interaction starts from scratch, with no continuity, personalization, or accumulated knowledge. As agents tackle longer and more complex tasks, memory architecture becomes one of the most consequential design decisions. This post breaks down the three dominant approaches and shows when to use each.

The Three Memory Tiers

Agent memory operates across three time horizons:

Architecture 1: Vector Database Retrieval

Vector retrieval is the most common long-term memory architecture. Conversations, documents, and facts are embedded as vectors and stored in a vector database. At query time, the agent embeds the query and retrieves the memories whose vectors are most similar to it.

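import time
import uuid

import chromadb  # pip install chromadb


class VectorMemoryStore:
    def __init__(self, llm=None, path="./agent_memory"):
        # Chroma embeds documents with its default embedding function here;
        # pass embedding_function=... to the collection to use OpenAI embeddings.
        client = chromadb.PersistentClient(path=path)
        self.db = client.get_or_create_collection("agent_memory")
        self.llm = llm  # placeholder: any object with summarize(list_of_texts) -> str

    def remember(self, text, metadata=None):
        # Timestamp every memory so reflect() can find the most recent ones.
        meta = {"created_at": time.time(), **(metadata or {})}
        self.db.add(ids=[str(uuid.uuid4())], documents=[text], metadatas=[meta])

    def recall(self, query, top_k=5):
        result = self.db.query(query_texts=[query], n_results=top_k)
        return result["documents"][0]  # texts for the single query, best match first

    def reflect(self, last_n=20):
        """Consolidate recent memories to reduce redundancy"""
        stored = self.db.get(include=["documents", "metadatas"])
        newest_first = sorted(
            zip(stored["metadatas"], stored["documents"]),
            key=lambda pair: pair[0].get("created_at", 0),
            reverse=True,
        )
        recent = [text for _, text in newest_first[:last_n]]
        summary = self.llm.summarize(recent)
        self.remember(summary, {"type": "consolidated"})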
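
To make the retrieval step concrete, here is a minimal sketch of what a vector database does at query time, using only the standard library. The three-dimensional vectors are hand-written stand-ins for real embeddings, and `cosine_similarity` and `top_k` are illustrative names rather than any library's API. A real store would typically use an approximate nearest-neighbor index instead of this brute-force scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, memories, k=5):
    """Return the k (text, score) pairs most similar to query_vec.

    memories is a list of (text, vector) pairs; every vector is scanned.
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in memories]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy "embeddings": hand-written vectors, not output from a real model.
memories = [
    ("User prefers dark mode", [0.9, 0.1, 0.0]),
    ("User's dog is named Rex", [0.0, 0.2, 0.9]),
    ("User switched the UI theme last week", [0.8, 0.3, 0.1]),
]
query = [1.0, 0.2, 0.0]  # pretend this encodes "what theme does the user like?"
results = top_k(query, memories, k=2)
```

The two theme-related memories rank above the unrelated one because their vectors point in nearly the same direction as the query, regardless of shared keywords.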

Strengths:

Weaknesses:

Architecture 2: Knowledge Graphs

A knowledge graph stores memories as entities and the relationships between them. The agent can traverse the graph to find indirect connections, reason about relationships, and maintain ontological structure.

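import networkx as nx  # pip install networkx; stands in for a graph database here


class KnowledgeGraphMemory:
    def __init__(self, reasoner=None):
        self.graph = nx.MultiDiGraph()  # directed; allows several predicates per entity pair
        self.reasoner = reasoner        # placeholder: any object with apply_rules(query, graph)

    def remember(self, facts):
        """facts = [{'subject': 'Alice', 'predicate': 'works_at', 'object': 'Google'}, ...]"""
        for fact in facts:
            self.graph.add_edge(fact['subject'], fact['object'], predicate=fact['predicate'])

    def recall(self, entity, depth=2):
        """Find every triple connected to an entity within N hops"""
        if entity not in self.graph:
            return []
        # undirected=True follows edges in both directions from the entity.
        nearby = nx.ego_graph(self.graph, entity, radius=depth, undirected=True)
        return [(s, data['predicate'], o) for s, o, data in nearby.edges(data=True)]

    def infer(self, query):
        """Apply graph reasoning rules"""
        return self.reasoner.apply_rules(query, self.graph)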
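
The "within N hops" traversal behind recall can be illustrated without a graph database. The sketch below keeps triples in a plain list and walks them breadth-first. The function name `neighbors_within` and the sample facts are hypothetical, and a production system would delegate this work to the graph store's query engine.

```python
from collections import deque


def neighbors_within(triples, start, max_depth=2):
    """Breadth-first search over (subject, predicate, object) triples.

    Returns {entity: hops_from_start} for every entity reachable within
    max_depth hops, treating each triple as traversable in both directions.
    """
    adjacency = {}
    for subject, _predicate, obj in triples:
        adjacency.setdefault(subject, set()).add(obj)
        adjacency.setdefault(obj, set()).add(subject)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        entity = queue.popleft()
        if hops[entity] == max_depth:
            continue  # do not expand past the hop limit
        for neighbor in adjacency.get(entity, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[entity] + 1
                queue.append(neighbor)
    return hops


# Made-up facts for illustration.
triples = [
    ("Alice", "works_at", "Google"),
    ("Bob", "works_at", "Google"),
    ("Bob", "lives_in", "Zurich"),
]
reachable = neighbors_within(triples, "Alice", max_depth=2)
```

Bob is found through the shared employer, an indirect connection that similarity search over the raw text could easily miss. Zurich sits three hops from Alice, so it is excluded at `max_depth=2`.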

Strengths:

Weaknesses:

  • Expensive to extract structured triples from unstructured text
  • Difficult to maintain graph consistency at scale
  • SPARQL/Cypher queries are less flexible than semantic search
  • Slower retrieval for large graphs without good indexing

Architecture 3: Hybrid Memory Systems

Production agents increasingly use a hybrid approach — vector search for fuzzy retrieval, knowledge graphs for structured reasoning, and a lightweight key-value store for fast lookups.

class HybridAgentMemory:
    def __init__(self, fusion_ranker):
        self.vector_store = VectorMemoryStore()      # Semantic recall
        self.kg = KnowledgeGraphMemory()             # Relationship reasoning
        self.kv = {}                                 # Fast lookups (user prefs, state); a dict stands in for a KV store
        self.fusion_ranker = fusion_ranker           # placeholder: any object with merge(*result_lists)

    def remember(self, text, structured_facts=None):
        self.vector_store.remember(text)
        if structured_facts:
            self.kg.remember(structured_facts)

    def recall(self, query):
        # Query every store. The calls run one after another here, but they are
        # independent and could be issued in parallel.
        semantic_results = self.vector_store.recall(query)
        # Assumes the query names an entity; a real system would extract entities first.
        graph_results = self.kg.recall(query)
        kv_results = self.kv.get(query)

        # Merge and rank
        return self.fusion_ranker.merge(semantic_results, graph_results, kv_results)
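
The merge step is not specified in the class above. One common way to combine ranked lists from different retrievers is reciprocal rank fusion (RRF), sketched here as an assumption about what such a ranker might do rather than as the method this post prescribes. The function name and the sample result lists are hypothetical, and `k=60` is the constant conventionally used with RRF.

```python
def reciprocal_rank_fusion(*ranked_lists, k=60):
    """Merge ranked lists into one ranking.

    Each item scores the sum of 1 / (k + rank) over the lists it appears in,
    with ranks starting at 1, so items ranked highly by several retrievers
    rise to the top.
    """
    scores = {}
    for ranked in ranked_lists:
        for rank, item in enumerate(ranked, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical results from the three stores, best match first.
semantic_results = ["prefers dark mode", "changed theme last week", "uses a laptop"]
graph_results = ["works at Google", "prefers dark mode"]
kv_results = ["prefers dark mode"]

merged = reciprocal_rank_fusion(semantic_results, graph_results, kv_results)
```

RRF uses only rank positions, so it avoids having to put cosine similarities, graph distances, and exact key hits on one comparable scale.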

Architecture 4: Memory Consolidation & Forgetting

The most overlooked aspect: agents need to consolidate and forget, just like humans.

Consolidation patterns:

  • Summarization: Compress multiple related memories into a single summary
  • Abstraction: Extract general principles from specific instances
  • Clustering: Group related memories and store the centroid
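
The clustering pattern can be sketched as a greedy single pass: each memory vector joins the first cluster whose centroid is similar enough, and otherwise starts a new cluster. The 0.9 threshold, the function names, and the toy two-dimensional vectors are assumptions for illustration. A real system would more likely run a library clustering algorithm over real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cluster_memories(vectors, threshold=0.9):
    """Greedy clustering: returns a list of clusters, each a list of vectors."""
    clusters = []
    for vec in vectors:
        for members in clusters:
            if cosine(vec, centroid(members)) >= threshold:
                members.append(vec)
                break
        else:
            clusters.append([vec])  # no existing cluster was close enough
    return clusters


# Toy vectors: two memories about one topic, two about another.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clusters = cluster_memories(vectors)
centroids = [centroid(members) for members in clusters]
```

Storing the two centroids in place of the four original vectors halves the index size, at the cost of losing the detail that distinguished the individual memories.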

Forgetting patterns:

  • Time-based decay: Reduce retrieval score for old memories
  • Usage-based: Promote frequently accessed memories, demote unused ones
  • Relevance pruning: Remove memories that are never retrieved
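
Time-based decay and usage-based promotion can be folded into a single retrieval score. The sketch below multiplies similarity by an exponential recency factor and a logarithmic usage boost, then prunes memories that fall below a floor. The 30-day half-life, the boost formula, the 0.05 floor, and the `Memory` fields are illustrative assumptions, not values taken from any particular system.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    similarity: float   # similarity to the current query, between 0 and 1
    age_days: float     # time since the memory was stored
    access_count: int   # how many times it has been retrieved


def retrieval_score(memory, half_life_days=30.0):
    """Similarity, halved every half_life_days, boosted by past usage."""
    recency = 0.5 ** (memory.age_days / half_life_days)
    usage = 1.0 + math.log1p(memory.access_count)
    return memory.similarity * recency * usage


def prune(memories, min_score=0.05):
    """Keep only memories whose score is at least min_score."""
    return [m for m in memories if retrieval_score(m) >= min_score]


memories = [
    Memory("fresh, never used", similarity=0.8, age_days=0, access_count=0),
    Memory("old, never used", similarity=0.8, age_days=180, access_count=0),
    Memory("old, used often", similarity=0.8, age_days=180, access_count=50),
]
kept = prune(memories)
```

With these settings the old, unused memory is dropped, while the equally old but frequently retrieved one survives because its usage boost offsets the decay.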

Production Considerations

Concern                   | Vector DB | Knowledge Graph | Hybrid
--------------------------|-----------|-----------------|----------
Setup complexity          | Low       | High            | Very High
Retrieval speed           | Fast (ms) | Variable        | Moderate
Scale (millions of facts) | Excellent | Moderate        | Good
Relational reasoning      | None      | Excellent       | Good
Fuzzy/semantic search     | Excellent | Poor            | Excellent
Explainability            | Low       | High            | Moderate

Recommendations

  • Simple agent, basic memory: Vector DB (Chroma or Qdrant)
  • Knowledge-intensive, relational domain: Knowledge Graph (Neo4j or Amazon Neptune)
  • Production agent with diverse memory needs: Hybrid (Vector DB + KG + KV)
  • Budget-constrained: Start with vector DB, add KG only when relational queries become critical

What’s Next

The frontier in 2027 is adaptive memory systems — agents that decide for themselves what to remember, what to consolidate, and what to forget. Early research (MemGPT, Generative Agents, Reflexion) points toward agents with increasingly human-like memory management. The teams that get memory right will build agents that genuinely improve over time — not just within a session, but across weeks and months of interaction.

Part of the Evergreen AI Guides collection.
