Don't Use RAG as Memory: How to Build Real Memory for AI Agents

Vector-store-based RAG (Retrieval-Augmented Generation) excels at knowledge retrieval but has serious limitations when used as memory. This video identifies the problems with RAG-based memory and their causes, introduces a new memory approach built on 'time-sensitive knowledge graphs,' and presents Graphiti, an open-source framework that makes such graphs easy to build. You'll gain concrete strategies and real-world examples for how agents can remember intelligently and make judgments based on context that changes over time.

1. The Problem with Existing RAG-Based Memory

Daniel opens the talk by introducing himself as the founder of Zep and explains that he has been building memory infrastructure for AI agents. Then he gets straight to the point:

"You're all using memory completely wrong."

Of course, it's not really the audience's fault -- the problem lies in the structural issues of the widely used frameworks, especially vector database-based RAG.

Why Memory Matters

AI agents often fail to properly utilize dynamic information (user behavior, preference changes, in-app business data, etc.) gathered from conversations with users. As a result:

"The agent can't remember customers, always gives obvious or irrelevant answers, and ultimately loses customers."

In other words, it's not enough to simply find similar content -- agents need to remember "context and changes that evolve over time" to deliver real value.

2. Why Does RAG Give Wrong Answers?

Daniel illustrates the problem with an Adidas shoe example. Initially, the user was happy with an Adidas recommendation, but became disappointed when the shoes fell apart and has since come to prefer Puma. Yet when the request "recommend sneakers for me" comes up again, vector-based RAG keeps pulling up the old 'Adidas preference' fact as the 'most similar' match.

"Each fact is stored as an independent, immutable fragment with no connections to others. Vector RAG has no causal relationships, no passage of time."

Vector RAG cannot represent the change itself, its cause (why the preference changed), or its timing (when it changed). Even though the user has changed their mind, the old fact still scores as 'most similar,' so the same recommendation keeps coming back.
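The failure mode can be reproduced with a toy example. The sketch below is only an illustration: the bag-of-words `embed` function stands in for a real embedding model, and the three stored facts are hypothetical paraphrases of the Adidas story. It shows that ranking by similarity alone surfaces the oldest fact, because nothing in the store records that it was later contradicted.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Facts are stored as independent fragments: no timestamps, no links between them.
facts = [
    "user loves adidas sneakers",      # oldest
    "user adidas shoes fell apart",    # the cause of the change
    "user now prefers puma",           # newest
]

query = "recommend sneakers for user"
ranked = sorted(facts, key=lambda f: cosine(embed(query), embed(f)), reverse=True)
top_fact = ranked[0]
print(top_fact)  # the stale Adidas preference ranks first
```

The newest fact shares fewer words with the query than the stale one, so similarity ranking has no way to prefer it.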

3. The New Solution: Knowledge Graphs and Graphiti

Here Daniel emphasizes clearly:

"Graphs can handle the 'why' and the 'when.'"

The Core of the Graphiti Framework

Graphiti is an open-source framework created by Zep that:

Builds knowledge graphs that reflect dynamic, temporal information in real time
Records not just connections between pieces of information but also the passage of time, state changes, and causal relationships

"Graphiti tracks when each fact is valid and when it has been invalidated."

For example, when the user becomes disappointed with their Adidas shoes:

The 'Adidas preference' relationship is invalidated
A new 'Puma preference' is added

But past data is not deleted. It remains as a record that 'this is how things used to be,' which makes it possible to track changes over time.
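The invalidate-but-keep behavior described above can be sketched in a few lines. This is a minimal illustration, not Graphiti's actual data model or API: the `Fact` and `TemporalMemory` classes, the `valid_at` and `invalid_at` field names, and the dates are all assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Fact:
    """A hypothetical edge with a validity interval."""
    subject: str
    relation: str
    obj: str
    valid_at: datetime
    invalid_at: Optional[datetime] = None  # None means "still valid"

    def is_valid(self, when: datetime) -> bool:
        return self.valid_at <= when and (self.invalid_at is None or when < self.invalid_at)


class TemporalMemory:
    def __init__(self) -> None:
        self.facts: List[Fact] = []

    def assert_fact(self, subject: str, relation: str, obj: str, when: datetime) -> None:
        # Invalidate (but keep) any still-valid fact for the same subject and relation.
        for fact in self.facts:
            if fact.subject == subject and fact.relation == relation and fact.invalid_at is None:
                fact.invalid_at = when
        self.facts.append(Fact(subject, relation, obj, when))

    def as_of(self, when: datetime) -> List[Fact]:
        """Return the facts that were valid at the given moment."""
        return [f for f in self.facts if f.is_valid(when)]


memory = TemporalMemory()
memory.assert_fact("user", "PREFERS", "Adidas", datetime(2024, 1, 10))  # illustrative dates
memory.assert_fact("user", "PREFERS", "Puma", datetime(2024, 3, 5))

current = [f.obj for f in memory.as_of(datetime(2024, 4, 1))]
earlier = [f.obj for f in memory.as_of(datetime(2024, 2, 1))]
print(current, earlier)
```

Both facts remain in storage, so the agent can answer "what does the user prefer now?" and "what did they prefer in February?" from the same data.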

Operating Like Human Memory

"This way, it becomes much closer to the way humans remember changes and experiences in sequence."

4. How Does Graphiti Work in Practice?

Daniel shows Graphiti's operation in more detail.

Graphiti still uses vector embeddings (semantic search), combined with BM25 text search, to extract the relevant 'subgraph' from the full large-scale graph.
Graph traversal then quickly follows related information and causal relationships to provide richer context.
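One way to picture the combination of the two search methods with graph traversal is the sketch below. It makes several assumptions that the talk does not confirm: the two rankings are merged with reciprocal rank fusion (a common fusion technique, chosen here only for illustration), the node names and the adjacency dictionary are hypothetical, and the search results are hard-coded rather than computed.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists; items ranked high in many lists score best."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, node in enumerate(ranking, start=1):
            scores[node] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by name for determinism.
    return sorted(scores, key=lambda n: (-scores[n], n))


def expand(seeds: List[str], graph: Dict[str, List[str]], hops: int = 1) -> List[str]:
    """Breadth-first expansion from the seed nodes to collect a small subgraph."""
    seen = list(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, []):
                if neighbor not in seen:
                    seen.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


# Hard-coded stand-ins for the results of semantic search and BM25 search.
semantic_hits = ["puma_preference", "adidas_preference", "user"]
bm25_hits = ["puma_preference", "shoes_fell_apart", "adidas_preference"]

# Hypothetical adjacency list linking facts to related nodes.
graph = {
    "puma_preference": ["user"],
    "adidas_preference": ["shoes_fell_apart", "user"],
}

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
subgraph = expand(fused[:2], graph)
print(fused)
print(subgraph)
```

The fused ranking picks the seed nodes, and the one-hop expansion pulls in the connected cause (`shoes_fell_apart`) that neither ranking placed near the top.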

"By combining these, you can search quickly, accurately, and with structurally rich information."

This matters especially because the data structures that need to be remembered differ completely across business domains (e.g., e-commerce, mental health apps):

"Using Graphiti's custom entities and edges, you can easily design memory structures (ontologies) tailored to each domain's characteristics."

The ability to extract only the desired information while excluding 'unintended noise' is a major strength.
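As a rough picture of what a domain ontology buys you, the sketch below defines a tiny e-commerce schema and uses it to filter candidate relationships. Everything here is hypothetical: `EntityType`, `EdgeType`, `conforms`, and the type names are plain dataclasses invented for the example and are not Graphiti's own classes or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityType:
    name: str
    description: str


@dataclass(frozen=True)
class EdgeType:
    name: str
    source: str  # name of the entity type the edge starts from
    target: str  # name of the entity type the edge points to


# A hypothetical e-commerce ontology: only these types may enter memory.
ENTITY_TYPES = {
    t.name: t
    for t in (
        EntityType("Customer", "A person who shops with us"),
        EntityType("Brand", "A footwear brand"),
    )
}
EDGE_TYPES = {
    e.name: e
    for e in (EdgeType("PREFERS", "Customer", "Brand"),)
}


def conforms(edge_name: str, source_type: str, target_type: str) -> bool:
    """Accept a candidate relationship only if the ontology defines it."""
    edge = EDGE_TYPES.get(edge_name)
    return (
        edge is not None
        and source_type in ENTITY_TYPES
        and target_type in ENTITY_TYPES
        and (edge.source, edge.target) == (source_type, target_type)
    )


# Candidate relationships, as an extraction step might propose them.
candidates = [
    ("PREFERS", "Customer", "Brand"),
    ("TALKED_ABOUT", "Customer", "Weather"),  # off-topic noise for this domain
]
kept = [c for c in candidates if conforms(*c)]
print(kept)
```

Declaring the schema up front is what lets off-topic relationships be dropped before they reach the graph.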

5. When Are RAG, Graph RAG, and Graphiti Each Appropriate?

Daniel compares the pros, cons, and use cases of RAG, Graph RAG, and Graphiti without claiming that any single approach is perfect.

RAG: Still useful for simple retrieval of static information
Graph RAG: Slow and expensive when the graph must be re-summarized frequently (the LLM repeats summarization, causing significant delays)
Graphiti: Powerful for continuously changing data (customer behavior, preferences, real-time changes, etc.)
- Fast lookups (tens to hundreds of milliseconds)
- Flexible business domain modeling
- Optimized LLM usage (with an emphasis on fast processing)

"The key is properly understanding and choosing which approach is needed for each situation."

6. Q&A: Practical Technical Questions

6.1. Standalone Use of Graphiti

"Graphiti is open source and available to anyone on GitHub, even without Zep. You can easily connect it to partner platforms like Neo4j and start using it right away!"

6.2. LLM, Unstructured Data Processing, Fact Invalidation

"Graphiti uses LLMs to parse unstructured conversations, emails, JSON, and various other data into the knowledge graph."

"When facts change or conflict, the LLM identifies causal relationships, and even emotions such as disappointment, to handle invalidation."

6.3. Reverting to Past States (Re-validation)

"Depending on the context, you can either add new facts (edges) that supersede the old ones or restore an earlier fact by nullifying its invalidation date."

6.4. Comparison with Microsoft GraphRAG: Why Real-Time Is Possible

"GraphRAG summarizes across multiple stages, and when data changes, all summaries need to be redone. Graphiti precisely updates only the small parts that are affected, making it far more efficient in both cost and time."

6.5. Ontology Generation Methods

"Graphiti can automatically generate ontologies, or users can directly define their desired entities/edges to create customized memory structures."

7. Closing

The talk concludes firmly:

"Agent memory is not just about knowledge retrieval -- the core is the ability to track and understand time and the reasons behind relationships."

Time-and-relationship-aware graphs like Graphiti let agents:

Remember changes in user behavior, preference shifts, and state transitions
Upgrade their memory to be fast, rich, and well suited to the business

At the close of the talk, links to presentation materials and papers were shared:

"Zep builds an agent brain that integrates conversations and business data, providing a real-time panoramic view of the entire user. It can proactively solve complex problems too!"

Conclusion

Existing RAG-based memory is suitable only for 'static knowledge retrieval' and has significant limitations as real memory (tracking changes, causal and relational connections, the passage of time).
The knowledge-graph-based Graphiti approach, with its time sensitivity and explicit relational connections, is a practical and powerful tool for building AI with human-like memory.
Graphiti is open source and easy to experiment with, so it is well worth trying if you are thinking about memory structures suited to your business.


