Using Mem0 Memory Store with Haystack Agents


Mem0 is a managed memory layer for AI agents. Instead of passing the entire conversation history to an LLM on every turn, Mem0 extracts key facts from conversations and stores them as compact memories.

At a high level, Mem0 manages a cycle of extraction, consolidation, and retrieval. When new messages arrive, relevant facts are identified and stored. Over time, memories are merged, updated, or allowed to fade if they lose relevance. When the agent later needs context, Mem0 surfaces only the memories most relevant to the current query, helping to keep token usage and latency low.
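To make the cycle concrete, here is a deliberately simplified sketch in plain Python. It is not how Mem0 works internally: `ToyMemoryStore` is a hypothetical stand-in, consolidation is reduced to skipping exact duplicates, and relevance is approximated by word overlap rather than by the semantic search a real memory layer would use.

```python
import re
from dataclasses import dataclass, field


def _tokens(text: str) -> set[str]:
    """Lowercase word set used for a crude relevance score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class ToyMemoryStore:
    """Hypothetical in-memory stand-in; not how Mem0 is implemented."""

    memories: dict[str, list[str]] = field(default_factory=dict)

    def add(self, user_id: str, fact: str) -> None:
        # Consolidation (simplified): skip facts already stored for this user.
        facts = self.memories.setdefault(user_id, [])
        if fact not in facts:
            facts.append(fact)

    def search(self, user_id: str, query: str, top_k: int = 3) -> list[str]:
        # Retrieval: rank this user's facts by word overlap with the query.
        query_tokens = _tokens(query)
        scored = [(len(query_tokens & _tokens(f)), f) for f in self.memories.get(user_id, [])]
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
        return [fact for _, fact in ranked[:top_k]]


store = ToyMemoryStore()
store.add("alice", "Prefers the Haystack framework for Python projects")
store.add("alice", "Lives in Florence and loves mountains")
store.add("alice", "Lives in Florence and loves mountains")  # duplicate, ignored

top_memories = store.search("alice", "Which Python framework should I use?", top_k=1)
print(top_memories)
```

Only the fact that shares words with the query is returned, which is the property that keeps the context passed to the LLM small.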

In this notebook, we will:

  1. Set up a Mem0MemoryStore and add memories about a user
  2. Inspect what Mem0 actually stored
  3. Create a Haystack Agent with Mem0 memory tools
  4. Ask the Agent personalized questions and see how it leverages stored memories

Note: The Mem0 integration now lives in the mem0-haystack package in Haystack Core Integrations. The integration provides Mem0MemoryStore, Mem0MemoryRetriever, Mem0MemoryWriter, and ready-made Agent tools.

Install the required dependencies

We need haystack-ai for the core framework and mem0-haystack for the Mem0 memory store, pipeline components, and Agent tools.

!pip install haystack-ai mem0-haystack

Set up API keys

This notebook requires two API keys:

  • OPENAI_API_KEY: Used by the OpenAIChatGenerator to power the Agent’s LLM.
  • MEM0_API_KEY: Used by Mem0MemoryStore to connect to the Mem0 Platform. You can get a free API key by signing up.

import os
from getpass import getpass

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

if not os.environ.get("MEM0_API_KEY"):
    os.environ["MEM0_API_KEY"] = getpass("Enter your Mem0 API key: ")
Enter your OpenAI API key:  ········
Enter your Mem0 API key:  ········

Create the Mem0 Memory Store

The Mem0MemoryStore connects to the Mem0 Platform using the MEM0_API_KEY environment variable. Each memory is associated with a user_id, which allows Mem0 to maintain separate memory spaces for different users.

We will add several facts about a user - their preferences, background, and work context. Mem0 will extract the key facts from these messages and store them as structured memories. When the Agent later queries the memory store, only the most relevant memories will be retrieved rather than replaying the full conversation history.

from haystack.dataclasses import ChatMessage
from haystack_integrations.memory_stores.mem0 import Mem0MemoryStore

USER_ID = "agent_example"

memory_store = Mem0MemoryStore()

messages = [
    ChatMessage.from_user("I like to listen to Russian pop music"),
    ChatMessage.from_user("I liked cold spanish latte with oat milk"),
    ChatMessage.from_user("I live in Florence Italy and I love mountains"),
    ChatMessage.from_user(
        "I am a software engineer and I like building application in python. "
        "Most of my projects are related to NLP and LLM agents. "
        "I find it easier to use Haystack framework to build my projects."
    ),
    ChatMessage.from_user(
        "I work in a startup and I am the CEO of the company. "
        "I have a team of 10 people and we are building a platform "
        "for small businesses to manage their customers and sales."
    ),
]

memory_store.add_memories(user_id=USER_ID, messages=messages, infer=True)
[]

Inspect Stored Memories

Before connecting the memory store to an Agent, let’s see what Mem0 actually stored. The search_memories method lets us query the memory store and see which memories are retrieved for a given query. This is useful for understanding how Mem0 extracts and condenses information from raw messages.

Note: Inferred memories may take a moment to become searchable, usually no more than a minute.
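One way to handle this delay, rather than sleeping for a fixed time, is to poll until the search returns something. The helper below is a generic sketch and not part of the Mem0 or Haystack API; `wait_for_results`, `fake_search`, and the timeout values are illustrative assumptions.

```python
import time
from typing import Callable, Sequence


def wait_for_results(
    search: Callable[[], Sequence],
    timeout_s: float = 60.0,
    interval_s: float = 5.0,
) -> Sequence:
    """Call `search` until it returns something or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        results = search()
        if results or time.monotonic() >= deadline:
            return results
        time.sleep(interval_s)


# Stand-in for a store whose memories only appear on the third call.
calls = {"count": 0}


def fake_search() -> list[str]:
    calls["count"] += 1
    return ["User prefers Haystack"] if calls["count"] >= 3 else []


found = wait_for_results(fake_search, timeout_s=2.0, interval_s=0.01)
print(found, calls["count"])
```

With the store from this notebook, the `search` argument could be a lambda that wraps `memory_store.search_memories(query=..., user_id=USER_ID, top_k=3)`.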

results = memory_store.search_memories(
    query="What programming tools does this person use?",
    user_id=USER_ID,
    top_k=3,
)

for message in results:
    print(f"- {message.text}\n")
- User works as a software engineer specializing in Python, focusing on natural language processing and large language model agents, and prefers using the Haystack framework to develop these projects.

- User is the CEO of a startup, leading a team of ten people to build a platform that helps small businesses manage their customers and sales.

- User enjoys listening to Russian pop music, often seeking upbeat tracks that reflect contemporary Russian pop culture.

Notice how Mem0 has rephrased the original messages into facts. Instead of storing the full sentences, it extracts the key information and returns the memories most relevant to the query.

Create the Agent with Memory Tools

Now we create a Haystack Agent that uses an OpenAIChatGenerator as its LLM and two Mem0 tools:

  1. retrieve_memories searches for relevant memories, or retrieves all memories when no query is provided.
  2. store_memory writes durable user facts, preferences, and context back to Mem0.

The user_id is passed through Agent State. This lets a single Agent instance serve multiple users while keeping each user’s memories scoped correctly.
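The idea behind state-scoped tools can be sketched without any framework. In the conceptual example below, the model only chooses a tool's ordinary arguments, while `user_id` is supplied from run-level state. This is not Haystack's implementation: `call_tool`, the two tool functions, and the `_memories` dictionary are hypothetical names used for illustration.

```python
from collections import defaultdict

# Hypothetical per-user storage standing in for the Mem0 Platform.
_memories: dict[str, list[str]] = defaultdict(list)


def store_memory(text: str, user_id: str) -> str:
    """Tool body: the model supplies `text`; `user_id` comes from state."""
    _memories[user_id].append(text)
    return "Stored 1 memory item(s)."


def retrieve_memories(user_id: str) -> list[str]:
    """Return a copy of the memories that belong to one user."""
    return list(_memories[user_id])


def call_tool(tool, llm_args: dict, state: dict):
    """Merge model-chosen arguments with the run-scoped state."""
    return tool(**llm_args, user_id=state["user_id"])


# The same tools serve two users; only the state differs between runs.
call_tool(store_memory, {"text": "Loves mountains"}, state={"user_id": "alice"})
call_tool(store_memory, {"text": "Prefers beaches"}, state={"user_id": "bob"})

alice_memories = call_tool(retrieve_memories, {}, state={"user_id": "alice"})
bob_memories = call_tool(retrieve_memories, {}, state={"user_id": "bob"})
print(alice_memories, bob_memories)
```

Because the model never sees or sets `user_id`, it cannot read or write another user's memories by mistake.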

from haystack.components.agents import Agent
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.components.generators.utils import print_streaming_chunk
from haystack_integrations.tools.mem0 import Mem0MemoryRetrieverTool, Mem0MemoryWriterTool

retriever_tool = Mem0MemoryRetrieverTool(memory_store=memory_store, top_k=10)
writer_tool = Mem0MemoryWriterTool(memory_store=memory_store)

agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-5.4"),
    tools=[retriever_tool, writer_tool],
    system_prompt="""You are a helpful assistant with long-term memory.

Use the memory tools deliberately:
- At the beginning of every turn, first call retrieve_memories without a query to inspect all scoped memories.
- Use the retrieved memories to answer, decide whether a focused memory search is still needed, and decide whether anything new should be stored.
- When the user shares durable, user-specific facts or preferences, call store_memory before your final answer.
- Before storing, compare the proposed memory with the memories you just retrieved. Do not store duplicate facts.
- Do not store transient requests or facts that are only useful inside the current conversation.
""",
    streaming_callback=print_streaming_chunk,
    state_schema={"user_id": {"type": str}},
)
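The system prompt leaves duplicate detection to the model. If you want a deterministic safety net as well, a check like the following could run before a write. It is a sketch under simple assumptions: `normalize` and `is_duplicate` are hypothetical helpers, and they only catch near-verbatim repeats, not paraphrases.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", "", text.lower()).split())


def is_duplicate(candidate: str, existing: list[str]) -> bool:
    """Return True if `candidate` matches a stored memory after normalization."""
    target = normalize(candidate)
    return any(normalize(memory) == target for memory in existing)


existing_memories = ["The user plans to have some time off in June this year."]

repeat = is_duplicate("the user plans to have some time off in June this year", existing_memories)
new_fact = is_duplicate("The user lives in Florence.", existing_memories)
print(repeat, new_fact)
```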

Ask a Personalized Question

Let’s ask the Agent a question that it can only answer well if it remembers the user’s preferences. Based on the stored memories, the Agent knows the user is a Python developer who uses Haystack and works on NLP/LLM projects - so it should tailor its recommendation accordingly.

Note the user_id runtime parameter: this is how the Agent State passes the user scope into the Mem0 tools. Behind the scenes, the Agent can call retrieve_memories, inject the retrieved memories into the conversation as context, and then generate a response.

agent.run(
    messages=[
        ChatMessage.from_user(
            "Based on what you know about me, which framework should I use to design an AI travel agent?"
        )
    ],
    user_id=USER_ID,
);
[TOOL CALL]
Tool: retrieve_memories 
Arguments: {"top_k":50}

[TOOL RESULT]
- User resides in Florence, Italy, and has a strong affection for mountains, often expressing a desire to explore alpine landscapes nearby.
- User enjoys listening to Russian pop music, often seeking upbeat tracks that reflect contemporary Russian pop culture.
- User works as a software engineer specializing in Python, focusing on natural language processing and large language model agents, and prefers using the Haystack framework to develop these projects.
- User is the CEO of a startup, leading a team of ten people to build a platform that helps small businesses manage their customers and sales.
- User liked a cold Spanish latte made with oat milk, appreciating its refreshing flavor and dairy‑free profile.

[ASSISTANT]
Given what I know about you, I’d recommend **Haystack** first.

Why:
- You’re a **Python engineer** working on **NLP and LLM agents**.
- You already **prefer Haystack** for this kind of work.
- An AI travel agent is a strong fit for **retrieval + tools + orchestration**, which Haystack handles well.

Why Haystack fits a travel agent:
- **RAG pipelines** for destination info, policies, guides, reviews, and internal travel data
- **Agent/tool calling** for search, itinerary building, booking APIs, maps, weather, and pricing
- **Good Python ecosystem fit**, which matches your background
- Easier to move from prototype to a structured production pipeline

A practical stack for you:
- **Haystack** as the core orchestration framework
- **FastAPI** for the backend/API
- **Postgres + pgvector** or another vector store for travel knowledge
- External tools/APIs for:
  - flights/hotels
  - maps/geocoding
  - weather
  - calendars
  - payments/booking if needed

When I would *not* choose Haystack:
- If you want a very lightweight, code-first agent loop with minimal framework abstractions, you might prefer **LangGraph**.
- If your main need is multi-step deterministic workflow control with complex state machines, **LangGraph** can be stronger.

My bottom line:
- **Use Haystack** if you want the framework most aligned with your existing skills and preferences.
- **Use LangGraph** only if the travel agent is going to be highly workflow-heavy, with lots of branching, approvals, and long-running stateful sessions.

If you want, I can also give you:
1. a **Haystack-based architecture** for an AI travel agent, or  
2. a **Haystack vs LangGraph comparison** specifically for your startup context.

Ask a Follow-Up Question

Let’s try a different kind of question. This time, the user shares a new durable fact about vacation timing and asks for a recommendation based on personal preferences.

agent.run(
    messages=[
        ChatMessage.from_user(
            "Please remember that I plan to have some time off in June this year. "
            "Can you suggest a vacation destination for me?"
        )
    ],
    user_id=USER_ID,
);
[TOOL CALL]
Tool: retrieve_memories 
Arguments: {"query":null}

[TOOL RESULT]
- User resides in Florence, Italy, and has a strong affection for mountains, often expressing a desire to explore alpine landscapes nearby.
- User enjoys listening to Russian pop music, often seeking upbeat tracks that reflect contemporary Russian pop culture.
- User works as a software engineer specializing in Python, focusing on natural language processing and large language model agents, and prefers using the Haystack framework to develop these projects.
- User is the CEO of a startup, leading a team of ten people to build a platform that helps small businesses manage their customers and sales.
- User liked a cold Spanish latte made with oat milk, appreciating its refreshing flavor and dairy‑free profile.

[TOOL CALL]
Tool: store_memory 
Arguments: {"text":"The user plans to have some time off in June this year.","infer":false}

[TOOL RESULT]
Stored 1 memory item(s).

[ASSISTANT]
Since you’re in Florence and love the mountains, I’d suggest the **Dolomites** for a June vacation.

Why it fits:
- **Easy enough from Florence** compared with more faraway trips
- **Great in June**: pleasant weather, green valleys, wildflowers, and usually fewer crowds than peak summer
- **Perfect for mountain lovers**: dramatic alpine scenery, hiking, cable cars, mountain huts, and beautiful villages

A few especially nice areas:
- **Val Gardena** – great base, scenic, lots of hikes
- **Alta Badia** – beautiful and slightly more relaxed feel
- **Cortina d’Ampezzo** – iconic views and a bit more upscale
- **Alpe di Siusi** – amazing meadows and gentler walking routes

If you want something more relaxing than hiking-heavy, you could also consider:
- **Lake Bled + Julian Alps, Slovenia**
- **South Tyrol**
- **Aosta Valley**

If you want, I can give you:
1. a **mountain-focused recommendation**,  
2. a **relaxing nature getaway**, or  
3. a **3–5 day June itinerary** from Florence.

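The Agent references the user’s love of mountains and their location in Florence, Italy. In both runs it called retrieve_memories without a query, as the system prompt instructs, so it received all of the user’s memories and the LLM picked the relevant ones: technical memories for the framework question, lifestyle memories for the vacation question. A focused call with a query would instead have Mem0 return only the memories most relevant to that query.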

Because the Agent has access to store_memory, it can also store new durable user facts. Mem0 may need a few seconds to make newly written memories searchable, so you may have to wait briefly before the vacation-related memory shows up in a search.

results = memory_store.search_memories(
    query="When do I have planned vacation?",
    user_id=USER_ID,
    top_k=1,
)

for message in results:
    print(f"- {message.text}")
- The user plans to have some time off in June this year.

Memory vs. Retrieval

If you have used Haystack for RAG, you might wonder how a memory store differs from a document retriever. While both supply additional context to the LLM, they serve different purposes:

  • What is stored - A retriever pulls chunks from a knowledge base built on top of your documents. A memory store holds distilled facts learned from conversations, like user preferences, past decisions, and stated goals, but not raw documents.
  • Who it belongs to - Retrieved documents are typically shared across all users. Memories are scoped to a specific user, agent, app, or run, enabling personalization.
  • How it evolves - A document store changes only when you explicitly index new content. A memory store evolves as the agent converses: new facts can be extracted, conflicting ones can be updated, and low-relevance memories can decay over time.

In practice, the two are complementary. A Haystack Agent can use a retriever to answer factual questions from your knowledge base while using Mem0 to remember who it is talking to and what they care about.
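As a rough illustration of how the two sources can sit side by side, the sketch below assembles a prompt from shared document excerpts and user-scoped memories. `build_context` and the sample strings are hypothetical; in a real application the documents would come from a Haystack retriever and the memories from Mem0.

```python
def build_context(documents: list[str], memories: list[str], question: str) -> str:
    """Assemble a prompt that keeps shared knowledge and user memories apart."""
    doc_block = "\n".join(f"- {d}" for d in documents) or "- (none)"
    memory_block = "\n".join(f"- {m}" for m in memories) or "- (none)"
    return (
        "Knowledge base excerpts (shared across users):\n"
        f"{doc_block}\n\n"
        "What we know about this user:\n"
        f"{memory_block}\n\n"
        f"Question: {question}"
    )


prompt = build_context(
    documents=["The Dolomites are a mountain range in northeastern Italy."],
    memories=["User lives in Florence, Italy and loves mountains."],
    question="Where should I go in June?",
)
print(prompt)
```

Keeping the two blocks labeled separately lets the LLM tell general facts apart from personal context.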

Clean Up

Since memories are stored in your Mem0 account, let’s clean up the memories we created in this notebook to avoid leaving orphaned data. The Haystack memory store exposes the underlying Mem0 client, which supports scoped deletion by user_id.

memory_store.client.delete_all(user_id=USER_ID)
print(f"Deleted memories for {USER_ID}.")
Deleted memories for agent_example.