Agentic Memory Approaches: Comprehensive Taxonomy

Executive Summary

This document synthesizes research across 10 major agentic memory architectures, providing implementation details, performance benchmarks, and comparative analysis for production deployment.


Table of Contents

  1. A-MEM (Zettelkasten-based)
  2. Reflexion (Verbal RL)
  3. MemGPT (OS-inspired)
  4. DSPy (Declarative Pipelines)
  5. Mem0 (Hybrid Memory Layer)
  6. AWM (Workflow Memory)
  7. StateFlow (FSM Control)
  8. ADAS/Meta Agent Search
  9. Generative Agents (Simulacra)
  10. MemoryBank (Ebbinghaus-inspired)

1. A-MEM - Zettelkasten-based Dynamic Memory

Paper: Xu et al., arXiv 2502.12110 (Feb 2025)
Repository: https://github.com/agiresearch/A-mem

Core Concept

Dynamic memory organization inspired by the Zettelkasten method - creating interconnected knowledge networks through atomic notes and flexible linking.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                         A-MEM System                             │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  New Memory ──▶ Note Construction ──▶ Link Generation ──▶ Store │
│                      │                      │                    │
│                      ▼                      ▼                    │
│              ┌─────────────┐        ┌─────────────┐             │
│              │  Attributes │        │   Analyze   │             │
│              │  - content  │        │  Historical │             │
│              │  - keywords │        │   Memories  │             │
│              │  - tags     │        └─────────────┘             │
│              │  - context  │              │                      │
│              │  - embedding│              ▼                      │
│              └─────────────┘        Memory Evolution             │
│                                    (Update existing)             │
└─────────────────────────────────────────────────────────────────┘

Memory Schema

from dataclasses import dataclass
from datetime import datetime
from typing import List

import numpy as np

@dataclass
class MemoryNote:
    id: str                  # Unique identifier
    content: str             # Raw memory content
    keywords: List[str]      # Extracted keywords
    tags: List[str]          # Categorical tags
    context: str             # LLM-generated contextual description
    embedding: np.ndarray    # all-MiniLM-L6-v2 vector
    links: List[str]         # Connected memory IDs
    created_at: datetime
    updated_at: datetime
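A minimal sketch (not A-MEM's actual code) of the note-construction step from the diagram above: an LLM fills in keywords, tags, and a contextual description, and the embedding is computed with all-MiniLM-L6-v2. The llm_json helper (an LLM call that returns parsed JSON) is an assumption for illustration.

import uuid
from datetime import datetime

from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")

def construct_note(content: str) -> MemoryNote:
    # llm_json: assumed helper that prompts the LLM and parses a JSON response
    meta = llm_json(f"""Extract from the text below:
    - keywords (list of strings)
    - tags (list of strings)
    - context (one-sentence description)
    Text: {content}""")
    return MemoryNote(
        id=str(uuid.uuid4()),
        content=content,
        keywords=meta["keywords"],
        tags=meta["tags"],
        context=meta["context"],
        embedding=encoder.encode(content),
        links=[],
        created_at=datetime.now(),
        updated_at=datetime.now(),
    )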

Link Generation Algorithm

def generate_links(new_memory, historical_memories, threshold=0.7):
    """Two-stage link generation: semantic + LLM decision"""
    
    # Stage 1: Cosine similarity filtering
    candidates = []
    for mem in historical_memories:
        sim = cosine_similarity(new_memory.embedding, mem.embedding)
        if sim > threshold:
            candidates.append((mem, sim))
    
    # Stage 2: LLM decision for meaningful connections
    links = []
    for candidate, sim in candidates:
        prompt = f"""Determine if these memories should be linked:
        Memory 1: {new_memory.content}
        Memory 2: {candidate.content}
        Should link? (yes/no):"""
        
        if llm_call(prompt).strip().lower() == "yes":
            links.append(candidate.id)
    
    return links

Memory Evolution

When new memories are added, existing memories can be updated (a minimal sketch follows the list below):

  • Context descriptions refined based on new connections
  • Tags/keywords expanded from new insights
  • Link weights adjusted based on access patterns
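A hedged sketch of this evolution step (not the paper's implementation), reusing the hypothetical llm_json helper from the note-construction sketch above:

from datetime import datetime

def evolve_memories(new_memory, linked_memories):
    """Refresh neighbors' context and tags when a new, related note is linked."""
    for mem in linked_memories:
        update = llm_json(f"""A new related memory was added:
        {new_memory.content}

        Existing memory:
        context: {mem.context}
        tags: {mem.tags}

        Return JSON {{"context": "...", "tags": [...]}} - repeat the originals
        if no update is needed.""")
        mem.context = update["context"]
        mem.tags = update["tags"]
        mem.updated_at = datetime.now()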

Performance

| Benchmark   | A-MEM       | MemGPT | Improvement      |
|-------------|-------------|--------|------------------|
| Single Hop  | 0.85        | 0.72   | +18%             |
| Multi Hop   | 0.78        | 0.39   | +100%            |
| Token Usage | 1,200-2,500 | 16,900 | 85-93% reduction |

Key Insight

Ablation shows Link Generation (LG) + Memory Evolution (ME) provide 2.8x improvement on Single Hop tasks.


2. Reflexion - Verbal Reinforcement Learning

Paper: Shinn et al., NeurIPS 2023 (arXiv 2303.11366)

Core Concept

Self-reflection through verbal reinforcement - the agent generates natural language feedback about its failures and uses this to improve subsequent attempts.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    Reflexion Loop                                │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ┌─────────┐    ┌───────────┐    ┌──────────────┐              │
│   │  Actor  │───▶│ Evaluator │───▶│ Self-Reflect │              │
│   └─────────┘    └───────────┘    └──────────────┘              │
│        ▲                                  │                      │
│        │                                  ▼                      │
│        │                          ┌──────────────┐              │
│        └──────────────────────────│   Memory     │              │
│                                   │ (Reflections)│              │
│                                   └──────────────┘              │
└─────────────────────────────────────────────────────────────────┘

Two-Call Pattern (Code Generation)

# Call 1: Error Identification
error_prompt = f"""
Previous attempt failed with error:
{error_message}

Code that failed:
{failed_code}

Identify what went wrong and why.
"""
error_analysis = llm_call(error_prompt)

# Call 2: Implementation Correction
fix_prompt = f"""
Based on this analysis:
{error_analysis}

Previous reflections:
{memory.get_reflections()}

Generate corrected implementation:
"""
corrected_code = llm_call(fix_prompt)

Memory Structure

  • Sliding window: ~3 most recent reflections
  • Natural language format: Human-readable failure analysis
  • Accumulative: Reflections build on prior insights (a minimal buffer sketch follows this list)
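A minimal sketch of such a sliding-window buffer (illustrative names, not the authors' code), consistent with the memory.get_reflections() call used in the two-call pattern above:

class ReflectionMemory:
    def __init__(self, max_reflections: int = 3):
        self.max_reflections = max_reflections
        self.reflections: list[str] = []

    def add(self, reflection: str) -> None:
        # Keep only the most recent reflections (sliding window)
        self.reflections.append(reflection)
        self.reflections = self.reflections[-self.max_reflections:]

    def get_reflections(self) -> str:
        return "\n".join(f"- {r}" for r in self.reflections)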

Performance

| Task      | Reflexion  | Base GPT-4 |
|-----------|------------|------------|
| HumanEval | 88% pass@1 | 67%        |
| ALFWorld  | 97%        | 75%        |
| HotPotQA  | 77%        | 62%        |

3. MemGPT - OS-inspired Virtual Memory

Paper: Packer et al., arXiv 2310.08560 (Oct 2023)
Framework: Letta (https://github.com/letta-ai/letta)

Core Concept

Operating system metaphor - LLM manages its own memory through function calls, paging information between tiers as needed.

Memory Tiers

┌─────────────────────────────────────────────────────────────────┐
│                    MemGPT Memory Hierarchy                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                    CORE MEMORY                               ││
│  │  ┌──────────────────┐  ┌──────────────────┐                 ││
│  │  │  Persona Block   │  │   User Block     │                 ││
│  │  │  (Agent identity)│  │  (User profile)  │                 ││
│  │  └──────────────────┘  └──────────────────┘                 ││
│  └─────────────────────────────────────────────────────────────┘│
│                              │                                   │
│                              ▼                                   │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                   RECALL MEMORY                              ││
│  │            (Conversation history buffer)                     ││
│  └─────────────────────────────────────────────────────────────┘│
│                              │                                   │
│                              ▼                                   │
│  ┌─────────────────────────────────────────────────────────────┐│
│  │                  ARCHIVAL MEMORY                             ││
│  │              (Unlimited external storage)                    ││
│  └─────────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────────┘

Memory Management Functions

# Core memory operations
def core_memory_append(label: str, content: str):
    """Add to persona or user block"""
    
def core_memory_replace(label: str, old: str, new: str):
    """Edit existing core memory"""

# Archival operations  
def archival_memory_insert(content: str):
    """Store in long-term archival"""
    
def archival_memory_search(query: str, page: int = 0):
    """Retrieve from archival with pagination"""

# Conversation operations
def conversation_search(query: str, page: int = 0):
    """Search conversation history"""

Heartbeat Mechanism

# Agent can request additional processing steps
response = agent.step(user_message)

if response.request_heartbeat:
    # Continue processing without user input
    response = agent.step(None)  # Internal continuation

Context Budget: ~16,900 tokens
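An illustrative sketch of the paging idea (not Letta's actual implementation): when the prompt exceeds the context budget, older turns are flushed to archival memory and replaced with a summary. count_tokens and summarize are assumed helpers; archival_memory_insert is the function stub above.

CONTEXT_BUDGET = 16_900  # tokens

def maybe_evict(messages, count_tokens, summarize):
    """Page old conversation turns out of context when the budget is exceeded."""
    evicted = []
    # Evict oldest non-system messages until the prompt fits the budget
    while sum(count_tokens(m["content"]) for m in messages) > CONTEXT_BUDGET and len(messages) > 2:
        evicted.append(messages.pop(1))  # index 0 holds the system prompt
    if evicted:
        for msg in evicted:
            archival_memory_insert(msg["content"])  # persist to archival tier
        # Leave a recursive summary of the evicted turns in recall context
        messages.insert(1, {"role": "system",
                            "content": "Summary of earlier turns: " + summarize(evicted)})
    return messages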


4. DSPy - Declarative Self-Improving Python

Paper: Khattab et al., NeurIPS 2023 (arXiv 2310.03714)
Repository: https://github.com/stanfordnlp/dspy

Core Concept

Programming model that abstracts LM pipelines as text transformation graphs. Treats prompting as an optimization problem rather than manual engineering.

Three Abstractions

import dspy

# 1. SIGNATURES - Declarative I/O
"question -> answer"

class RAG(dspy.Signature):
    """Answer questions with retrieved context."""
    context = dspy.InputField()
    question = dspy.InputField()
    answer = dspy.OutputField()

# 2. MODULES - Parameterized components
class MyRAG(dspy.Module):
    def __init__(self):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=3)
        self.generate = dspy.ChainOfThought(RAG)
    
    def forward(self, question):
        context = self.retrieve(question).passages
        return self.generate(context=context, question=question)

# 3. TELEPROMPTERS - Optimization algorithms
optimizer = dspy.MIPROv2(metric=accuracy)
optimized_rag = optimizer.compile(
    MyRAG(),
    trainset=train_data
)

MIPRO Compiler Stages

  1. Grounded Proposal: Generate candidate instructions/demonstrations
  2. Discrete Search: Explore combinations
  3. Surrogate Model: Learn to predict quality (a toy sketch of all three stages follows this list)
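A toy sketch of these three stages under stated assumptions (this is not DSPy's implementation; propose_candidates and evaluate are hypothetical callables): candidates are proposed, scored on random mini-batches, and a simple running mean stands in for the surrogate model.

import random

def toy_mipro(propose_candidates, evaluate, trainset, trials=20):
    candidates = propose_candidates()                 # Stage 1: grounded proposal
    scores = {i: [] for i in range(len(candidates))}
    for _ in range(trials):                           # Stage 2: discrete search
        i = random.randrange(len(candidates))
        batch = random.sample(trainset, k=min(8, len(trainset)))
        scores[i].append(evaluate(candidates[i], batch))
    # Stage 3: surrogate estimate of quality (here, just the mean observed score)
    best = max(scores, key=lambda i: sum(scores[i]) / max(len(scores[i]), 1))
    return candidates[best]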

Performance

| Model      | Before | After                    | Task         |
|------------|--------|--------------------------|--------------|
| GPT-3.5    | 33%    | 82%                      | Case Study 1 |
| GPT-3.5    | 32%    | 46%                      | Case Study 2 |
| Llama2-13b | 9%     | 47%                      | Case Study 1 |
| T5-770M    | -      | Competitive with GPT-3.5 | General      |

Compilation time: Minutes to tens of minutes


5. Mem0 - Universal Memory Layer

Paper: Chhikara et al., arXiv 2504.19413 (April 2025)
Repository: https://github.com/mem0ai/mem0

Core Concept

Hybrid data store combining vector database + graph database + key-value store for comprehensive memory management.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     Mem0 Architecture                            │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌───────────────┐  ┌───────────────┐  ┌───────────────┐        │
│  │ Vector Store  │  │  Graph Store  │  │   KV Store    │        │
│  │  (Semantic)   │  │ (Relations)   │  │   (Facts)     │        │
│  └───────────────┘  └───────────────┘  └───────────────┘        │
│           │                 │                  │                 │
│           └─────────────────┴──────────────────┘                 │
│                            │                                     │
│                            ▼                                     │
│                   ┌─────────────────┐                           │
│                   │  LLM Processor  │                           │
│                   │ (CRUD + Update) │                           │
│                   └─────────────────┘                           │
└─────────────────────────────────────────────────────────────────┘

Memory Operations

| Operation | Description            |
|-----------|------------------------|
| ADD       | Insert new fact        |
| UPDATE    | Modify existing memory |
| DELETE    | Remove obsolete        |
| NO-OP     | Skip duplicate         |
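A hedged sketch of how such a decision can be made (not Mem0's internal code): an LLM compares a newly extracted fact against similar stored memories and returns one of the four operations. llm_call is the same assumed helper used elsewhere in this document.

import json

def resolve_operation(new_fact: str, similar_memories: list) -> dict:
    prompt = f"""New fact: {new_fact}
    Existing memories: {json.dumps(similar_memories)}

    Choose exactly one operation and answer in JSON:
    {{"op": "ADD" | "UPDATE" | "DELETE" | "NOOP",
      "target_id": "<existing memory id or null>",
      "text": "<memory text to store or update>"}}"""
    return json.loads(llm_call(prompt))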

Mem0ᵍ Graph Pipeline

# Extraction Phase
entities = EntityExtractor(message)  # → Nodes
relations = RelationsGenerator(entity_pairs)  # → Labeled edges
# Edge labels: 'lives_in', 'prefers', 'owns', 'happened_on'

# Update Phase  
conflicts = ConflictDetector(new_triples, existing_graph)
updates = UpdateResolver(conflicts)  # → {add, merge, invalidate, skip}
# Marks invalid rather than deleting (temporal reasoning)

# Dual Retrieval
# 1. Entity-centric: entity → similarity → traverse → subgraph
# 2. Semantic triplet: query embedding → match triplet embeddings

Code Example

import os

from mem0 import Memory

config = {
    "graph_store": {
        "provider": "neo4j",
        "config": {
            "url": os.environ["NEO4J_URL"],
            "username": "neo4j",
            "password": os.environ["NEO4J_PASSWORD"]
        }
    }
}

memory = Memory.from_config(config)
memory.add("Alice met Bob at GraphConf 2025", user_id="demo-user")
results = memory.search("Who did Alice meet?", user_id="demo-user")

Memory Scopes

| Scope      | Persistence              | Use Case                 |
|------------|--------------------------|--------------------------|
| user_id    | Across all conversations | User preferences         |
| session_id | Single conversation      | Current context          |
| agent_id   | Per agent instance       | Agent-specific knowledge |

Performance (LOCOMO Benchmark)

  • 26% higher accuracy vs OpenAI memory
  • 91% lower p95 latency vs full-context
  • 90% token savings
  • Mem0ᵍ: ~2% higher overall score than base Mem0

6. AWM - Agent Workflow Memory

Paper: Wang et al., arXiv 2409.07429 (Sep 2024)
Repository: https://github.com/zorazrw/agent-workflow-memory

Core Concept

Induces reusable workflows from agent trajectories, stores them in memory, and uses them to guide future task-solving through hierarchical composition.

Architecture

I(E_train) → W_offline       # Offline: induction module I distills workflows W from training experiences E_train
L(q, M + W, o_test) → a_test # Inference: agent LM L maps query q, memory M augmented with W, and observation o_test to action a_test

Two Modes

| Mode    | Description                            | Best For                 |
|---------|----------------------------------------|--------------------------|
| Offline | Pre-induce workflows from training set | Known task distributions |
| Online  | Streaming induce→integrate→utilize     | Novel task discovery     |

Workflow Representation

class Workflow:
    name: str           # "find_place_by_name"
    description: str    # Natural language
    steps: List[Action] # Primitive action sequence

# Hierarchical composition:
# Level 1 (Primitives): "click", "type"
# Level 2 (Induced): "find_place_by_name" 
# Level 3 (Composite): "get_place_zipcode" (uses Level 2)

Snowball Effect (Online Mode)

Query 1 → Solve → Induce W₁ → Memory
Query 2 → Solve (with W₁) → Induce W₂ (builds on W₁) → Memory
...
# Simple workflows become building blocks for complex ones
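An illustrative sketch of the online induce-integrate-utilize loop (not the authors' code); agent_solve and induce_workflow stand in for the LLM-backed solving and induction modules:

workflow_memory = []

def online_awm(queries):
    for query in queries:
        # Utilize: solve with primitive actions plus all workflows induced so far
        trajectory = agent_solve(query, workflows=workflow_memory)
        # Induce: abstract the trajectory into a reusable workflow
        new_workflow = induce_workflow(trajectory, existing=workflow_memory)
        # Integrate: later workflows can compose earlier ones (the snowball effect)
        if new_workflow is not None:
            workflow_memory.append(new_workflow)
    return workflow_memory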

Performance

| Benchmark | AWM vs Baseline | Improvement           |
|-----------|-----------------|-----------------------|
| Mind2Web  | +24.6%          | Relative success rate |
| WebArena  | +51.1%          | Relative success rate |
| Steps     | 7.9 → 5.9       | Average steps reduced |

7. StateFlow - Finite State Machine Control

Paper: Wu et al., arXiv 2403.11322 (Mar 2024)
Integration: AutoGen GroupChat

Core Concept

Models LLM workflows as Finite State Machines, distinguishing process grounding (states/transitions) from sub-task solving (actions within states).

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                    StateFlow FSM Model                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   ┌───────┐    success    ┌─────────┐    success   ┌─────────┐  │
│   │ Init  │──────────────▶│ Observe │─────────────▶│  Solve  │  │
│   └───────┘               └─────────┘              └─────────┘  │
│                                 │                       │        │
│                                 │ error                 │ error  │
│                                 ▼                       ▼        │
│                           ┌─────────┐             ┌─────────┐   │
│                           │  Error  │◀────────────│ Verify  │   │
│                           └─────────┘             └─────────┘   │
│                                                        │        │
│                                                        │success │
│                                                        ▼        │
│                                                   ┌─────────┐   │
│                                                   │   End   │   │
│                                                   └─────────┘   │
└─────────────────────────────────────────────────────────────────┘

Two Variants

| Variant   | Description                                  | Context Management      |
|-----------|----------------------------------------------|-------------------------|
| StateFlow | Single LLM, different instructions per state | Instructions in context |
| SF_Agent  | Different LLM agents per state               | No context bloat        |
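To make the single-LLM StateFlow variant concrete, here is a hedged sketch (not the paper's code): each state supplies its own instruction, and a simple outcome check drives transitions. llm_call and check are assumed helpers, and the edges only loosely follow the FSM diagram above.

STATE_PROMPTS = {
    "init":    "Set up the environment and restate the task.",
    "observe": "Inspect the environment and summarize what you see.",
    "solve":   "Propose a concrete action to solve the current sub-task.",
    "verify":  "Check whether the last action achieved the goal.",
    "error":   "Diagnose the error and suggest a recovery step.",
}

TRANSITIONS = {  # (state, outcome) -> next state
    ("init", "success"): "observe",
    ("observe", "success"): "solve",
    ("observe", "error"): "error",
    ("solve", "success"): "verify",
    ("solve", "error"): "error",
    ("verify", "success"): "end",
    ("verify", "error"): "error",
    ("error", "success"): "observe",
}

def run_stateflow(task, max_steps=20):
    state, history = "init", []
    for _ in range(max_steps):
        if state == "end":
            break
        output = llm_call(f"{STATE_PROMPTS[state]}\nTask: {task}\nHistory: {history}")
        history.append((state, output))
        # check(output) is an assumed helper returning "success" or "error"
        state = TRANSITIONS.get((state, check(output)), "error")
    return history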

AutoGen Implementation

import autogen

def state_transition(last_speaker, groupchat):
    messages = groupchat.messages
    
    if last_speaker is initializer:
        return coder  # Init → Retrieve
    elif last_speaker is coder:
        return executor  # Retrieve action
    elif last_speaker is executor:
        if "exitcode: 1" in messages[-1]["content"]:
            return coder  # Error → Retry
        else:
            return scientist  # Success → Research
    elif last_speaker == scientist:
        return None  # Research → End

groupchat = autogen.GroupChat(
    agents=[initializer, coder, executor, scientist],
    messages=[],
    max_round=20,
    speaker_selection_method=state_transition,
)

Performance

| Benchmark     | vs ReAct | Cost Reduction |
|---------------|----------|----------------|
| InterCode SQL | +13%     | 5x less        |
| ALFWorld      | +28%     | 3x less        |

Key Insight: Combines with Reflexion for further improvement.


8. ADAS - Automated Design of Agentic Systems

Paper: Hu et al., ICLR 2025 (arXiv 2408.08435)
Repository: https://github.com/ShengranHu/ADAS

Core Concept

Meta Agent Search - a meta agent iteratively programs new agents in code, evaluates them, and builds an archive of discoveries to inform future iterations.

Three Components

┌─────────────────────────────────────────────────────────────────┐
│              ADAS Components                                     │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  1. SEARCH SPACE: Code (Turing Complete)                        │
│     → Can represent ANY agentic system                          │
│                                                                  │
│  2. SEARCH ALGORITHM: Meta Agent Search                         │
│     Archive → Meta Agent → New Code → Evaluate → Archive        │
│                                                                  │
│  3. EVALUATION: Task accuracy on benchmarks                     │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Algorithm

def meta_agent_search(iterations=25):
    # Seed with baseline agents
    archive = [CoT, CoT_SC, SelfRefine, LLM_Debate]
    
    for i in range(iterations):
        # Meta agent (GPT-4) generates new agent code
        agent_code = meta_agent.generate(
            archive=archive,
            instruction="Create novel, interesting agent"
        )
        
        # Evaluate on benchmark (using GPT-3.5)
        performance = evaluate(agent_code, benchmark)
        
        # Add to the archive if novel and performant
        # (is_novel, threshold, and best_agent are assumed helpers)
        if is_novel(agent_code, archive) and performance > threshold:
            archive.append((agent_code, performance))
    
    return best_agent(archive)

Performance

| Domain              | Improvement |
|---------------------|-------------|
| DROP (F1)           | +13.6/100   |
| MGSM (accuracy)     | +14.4%      |
| GSM8K (transfer)    | +25.9%      |
| GSM-Hard (transfer) | +13.2%      |

Key Finding: Transferability

Agents discovered in math domain transfer to:

  • Reading comprehension
  • Science questions
  • Multi-task problems

This suggests ADAS discovers general design patterns, not task-specific tricks.


9. Generative Agents - Interactive Simulacra

Paper: Park et al., UIST 2023 (arXiv 2304.03442)
Repository: https://github.com/joonspk-research/generative_agents

Core Concept

Computational agents that simulate believable human behavior through memory streams, reflection, and planning - demonstrated in a Sims-like sandbox environment.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│               Generative Agent Architecture                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Perception ──▶ Memory Stream ──▶ Retrieval ──▶ Action         │
│                       │              ▲                           │
│                       ▼              │                           │
│                  Reflection ─────────┘                           │
│                       │                                          │
│                       ▼                                          │
│                   Planning                                       │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Memory Stream

Complete record of agent experiences in natural language:

class MemoryObject:
    description: str             # Natural language observation
    created_at: datetime
    last_accessed: datetime
    importance: float            # 1-10 scale (LLM-assigned)
    is_reflection: bool = False  # True for insights produced by reflection

Retrieval Function

from math import exp  # cosine_sim, embed, and hours_since_access are assumed helpers

def retrieve(agent, query, k=10, decay=0.005):
    """Retrieve memories by combining recency, importance, and relevance"""
    
    for memory in agent.memory_stream:
        # Recency: exponential decay since last access (decay rate is illustrative)
        recency = exp(-decay * hours_since_access(memory))
        
        # Importance: LLM-assigned score, normalized to [0, 1]
        importance = memory.importance / 10
        
        # Relevance: embedding similarity
        relevance = cosine_sim(embed(query), embed(memory.description))
        
        # Combined score (equal weighting)
        memory.score = (recency + importance + relevance) / 3
    
    # Highest-scoring memories first
    return sorted(agent.memory_stream, key=lambda m: m.score, reverse=True)[:k]

Reflection Process

Reflection is triggered when the sum of importance scores for recent events exceeds a threshold (~150):

def reflect(agent):
    # 1. Get recent memories
    recent = agent.memory_stream[-100:]
    
    # 2. Generate salient questions (parse one question per line)
    questions = llm_call(f"""
    Given these observations:
    {recent}
    
    What are the 3 most salient high-level questions?
    Return one question per line.
    """).strip().split("\n")
    
    # 3. Retrieve relevant memories per question
    for question in questions:
        relevant = retrieve(agent, question)
        
        # 4. Generate insights
        insight = llm_call(f"""
        Statements: {relevant}
        Question: {question}
        
        What 5 high-level insights can you infer?
        """)
        
        # 5. Store reflection as new memory
        agent.memory_stream.append(MemoryObject(
            description=insight,
            importance=8,  # Reflections are important
            is_reflection=True
        ))
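A minimal sketch of the trigger described above (illustrative, not the paper's code): reflect once the summed importance of recent, not-yet-reflected observations crosses the threshold.

def should_reflect(agent, threshold=150):
    recent = agent.memory_stream[-100:]
    return sum(m.importance for m in recent
               if not getattr(m, "is_reflection", False)) >= threshold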

Emergent Behaviors

From a single seed ("Isabella wants to throw a Valentine's Day party"):

  • Agents autonomously spread invitations over 2 days
  • Made new acquaintances
  • Asked each other on dates
  • Coordinated to arrive together at correct time

Ablation Results

| Condition         | Believability Score |
|-------------------|---------------------|
| Full architecture | Best                |
| No reflection     | Significant drop    |
| No planning       | Significant drop    |
| No observation    | Worst               |

10. MemoryBank - Ebbinghaus-inspired Long-Term Memory

Paper: Zhong et al., AAAI 2024 (arXiv 2305.10250)
Repository: https://github.com/zhongwanjun/MemoryBank-SiliconFriend

Core Concept

Human-like memory mechanism inspired by Ebbinghaus Forgetting Curve - memories decay over time but are reinforced through access and importance.

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                   MemoryBank Architecture                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Conversation ──▶ Event Extraction ──▶ Memory Storage            │
│                                             │                    │
│                                             ▼                    │
│                                   ┌─────────────────┐           │
│                                   │ Ebbinghaus Decay│           │
│                                   │   + Importance  │           │
│                                   └─────────────────┘           │
│                                             │                    │
│                                             ▼                    │
│  Response ◀── Memory Retrieval ◀── User Portrait                │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Forgetting Curve Implementation

from math import exp, log

def memory_strength(memory, current_time):
    """Ebbinghaus-inspired decay with importance weighting"""
    
    # Base retention from forgetting curve
    time_elapsed = current_time - memory.last_accessed
    retention = exp(-time_elapsed / memory.stability)
    
    # Importance factor
    importance_weight = memory.importance / 10
    
    # Reinforcement from access count
    reinforcement = log(1 + memory.access_count)
    
    return retention * importance_weight * reinforcement

Memory Components

| Component       | Description                             |
|-----------------|-----------------------------------------|
| Event Summaries | Extracted key events from conversations |
| User Portrait   | Synthesized personality understanding   |
| Memory Index    | Encoded representations for retrieval   |
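A hedged sketch tying these components together (not the authors' code): candidates are matched by embedding similarity, then re-weighted by the memory_strength decay function above so stale, rarely accessed memories fade. embed, cosine_sim, and the summary attribute are assumptions for illustration.

def retrieve_memories(query, memories, current_time, k=5):
    scored = []
    for mem in memories:
        relevance = cosine_sim(embed(query), embed(mem.summary))  # semantic match
        scored.append((relevance * memory_strength(mem, current_time), mem))
    # Highest combined score first
    return [mem for _, mem in sorted(scored, key=lambda x: x[0], reverse=True)[:k]]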

SiliconFriend Application

AI companion chatbot tuned on 38K psychological dialogs:

  • Empathetic responses
  • Personality understanding
  • Long-term relationship building

Performance

| Metric                    | SiliconFriend        | Baseline |
|---------------------------|----------------------|----------|
| Memory Retrieval Accuracy | High                 | -        |
| Response Correctness      | Improved             | -        |
| Empathy Score             | Significantly higher | -        |

Compatibility

Works with both:

  • Closed-source: ChatGPT
  • Open-source: ChatGLM, BELLE

Comparative Analysis

Memory Type Comparison

| Method            | Memory Type         | Organization     | Retrieval                    | Evolution          |
|-------------------|---------------------|------------------|------------------------------|--------------------|
| A-MEM             | Knowledge graph     | Dynamic linking  | Semantic + graph             | LLM-driven updates |
| Reflexion         | Episodic            | Verbal summaries | Context injection            | Accumulates        |
| MemGPT            | Tiered (OS-style)   | Main/External    | Function calls               | Self-editing       |
| DSPy              | Compiled traces     | Teleprompter opt | N/A (compiled)               | Optimization       |
| Mem0              | Hybrid (V+G+KV)     | Entity relations | Dual retrieval               | LLM CRUD           |
| AWM               | Workflow sequences  | Hierarchical     | Rule/LM-based                | Snowball           |
| StateFlow         | Context history     | FSM states       | State-based                  | N/A                |
| ADAS              | Agent code          | Archive          | Meta-level                   | Iterative          |
| Generative Agents | Memory stream       | Time-indexed     | Recency+Importance+Relevance | Reflection         |
| MemoryBank        | Episodic + Portrait | Ebbinghaus decay | Importance-weighted          | Forgetting curve   |

Token Efficiency Comparison

| Method     | Typical Context Usage | Efficiency |
|------------|-----------------------|------------|
| A-MEM      | 1,200-2,500 tokens    | Best       |
| Mem0       | 90% savings           | Excellent  |
| StateFlow  | 3-5x reduction        | Excellent  |
| MemoryBank | Variable              | Good       |
| MemGPT     | 16,900 tokens         | Moderate   |
| Reflexion  | Variable              | Moderate   |

Implementation Complexity

| Method            | Complexity | Dependencies      | Best Starting Point      |
|-------------------|------------|-------------------|--------------------------|
| Reflexion         | Low        | LLM only          | Immediate                |
| MemoryBank        | Low        | LLM + storage     | Immediate                |
| Mem0              | Low-Medium | Neo4j + vector DB | Production apps          |
| DSPy              | Medium     | DSPy library      | Pipeline optimization    |
| A-MEM             | Medium     | ChromaDB + LLM    | Knowledge-intensive      |
| StateFlow         | Medium     | AutoGen           | Sequential tasks         |
| MemGPT            | Medium     | Letta framework   | Long conversations       |
| AWM               | Medium     | Custom impl       | Web automation           |
| Generative Agents | High       | Custom sandbox    | Simulation research      |
| ADAS              | High       | Meta-agent infra  | Agent discovery research |

Implementation Recommendations

By Use Case

| Need                          | Recommended       | Why                                  |
|-------------------------------|-------------------|--------------------------------------|
| Production user memory        | Mem0              | 26% accuracy boost, production-ready |
| Complex reasoning chains      | A-MEM             | 2x better multi-hop performance      |
| Web automation                | AWM               | 51% improvement on WebArena          |
| Sequential task control       | StateFlow         | 5x cost reduction                    |
| Pipeline optimization         | DSPy              | 33%→82% quality improvement          |
| Long-term companionship       | MemoryBank        | Ebbinghaus-based, empathetic         |
| Social simulation             | Generative Agents | Emergent behaviors                   |
| Learning from failures        | Reflexion         | Simple, effective                    |
| Unbounded context             | MemGPT            | OS-inspired virtual memory           |
| Discovering new architectures | ADAS              | Meta-agent search                    |

Quick Start Priority

  1. Start with Reflexion - Simplest to implement, immediate benefits
  2. Add Mem0 - Production-ready persistent memory
  3. Integrate StateFlow - When control flow matters
  4. Consider A-MEM - For knowledge-intensive applications
  5. Explore ADAS - For cutting-edge agent discovery

References

| Paper ID   | Title                                         | Year |
|------------|-----------------------------------------------|------|
| 2502.12110 | A-MEM: Agentic Memory for LLM Agents          | 2025 |
| 2303.11366 | Reflexion: Verbal Reinforcement Learning      | 2023 |
| 2310.08560 | MemGPT: Virtual Memory for LLMs               | 2023 |
| 2310.03714 | DSPy: Compiling Declarative Language Programs | 2023 |
| 2504.19413 | Mem0: Universal Memory Layer                  | 2025 |
| 2409.07429 | Agent Workflow Memory                         | 2024 |
| 2403.11322 | StateFlow: State-Driven Workflows             | 2024 |
| 2408.08435 | ADAS: Automated Design of Agentic Systems     | 2025 |
| 2304.03442 | Generative Agents: Interactive Simulacra      | 2023 |
| 2305.10250 | MemoryBank: Long-Term Memory for LLMs         | 2023 |

Generated: January 2026
