@bar181 · Last active October 26, 2025

The Cognitive Substrate: A Deep Dive into the AISP-WE Memory & Learning Architecture

Author: Bradley Ross
Date: October 25, 2025

This document provides a comprehensive overview of the AISP-WE system's Stage 2 memory and learning architecture. This is not merely a database; it is a cognitive substrate designed to enable a state of continuous, compounding learning, transforming the system from a generative tool into a self-improving agentic entity.

1. Purpose and North Star: From Brute Force to a "Lego Piece" System

The primary purpose of this architecture is to create a learning flywheel. Stage 1 agentic systems are amnesiac, relying on the brute-force intelligence of an LLM for every task. This is inefficient, expensive, and fails to capture institutional knowledge.

Our North Star is a "Lego piece" system for autonomous development. The goal is to move from a generative to a compositional process, where a vast library of verifiably correct "Lego pieces" (Gold Standard patterns, code blocks, and workflows) are the primary form of creation. This architecture is the factory, library, and quality control line for these Lego pieces.

Long-Term Goals:

  • 60%+ LLM Token Reduction: By composing existing, high-quality patterns instead of generating new logic from scratch.
  • 50%+ "Edit vs. Build": Shift the majority of development work to composing and editing these proven "Lego pieces."
  • <20% "Unknowns": For a given domain (e.g., Python/TypeScript), have the vast majority of any new project be built from a library of 1,000+ certified, near-proof-level blocks.

2. The Two-Database Cognitive Architecture

The architecture is built on a strategic separation of concerns into two distinct SQLite databases, mirroring the distinction between working memory and long-term memory in cognitive science.

Database 1: memory.db (The Operational Brain / The Logbook)

  • Purpose: To handle the high-frequency, transient data of active agentic operations. It logs the "what is happening now."
  • Characteristics: WRITE-HEAVY, optimized for speed. Data has a short retention period (30-90 days) to remain lean and performant.
  • Contents: sessions, tasks, subtasks, task_runs, outcomes, events.

Database 2: learning.db (The Cognitive Brain / The Library)

  • Purpose: To serve as the system's permanent, long-term memory. It stores the "what the system knows."
  • Characteristics: READ-HEAVY, optimized for complex, semantic queries. Data is permanent, curated, and versioned.
  • Contents: learned_patterns, gold_standard_registry, execution_history_new, embeddings, and 24+ other supporting tables.

The rationale for this separation is performance isolation. The constant, high-frequency writes to memory.db will not create database locks that could slow down the critical, time-sensitive knowledge retrieval queries from learning.db.
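The two-database split described above can be sketched with Python's built-in sqlite3 module. The table columns below are illustrative assumptions, not the system's actual schema; in production these would be the on-disk files memory.db and learning.db (with WAL journaling enabled on memory.db to tolerate constant writes without blocking readers).

```python
import sqlite3

# Minimal sketch of the two-database separation of concerns.
# Columns are assumptions for illustration only.

mem = sqlite3.connect(":memory:")    # stands in for memory.db (write-heavy)
mem.executescript("""
CREATE TABLE sessions (
    id INTEGER PRIMARY KEY,
    started_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY,
    session_id INTEGER REFERENCES sessions(id),
    description TEXT
);
""")

learn = sqlite3.connect(":memory:")  # stands in for learning.db (read-heavy)
learn.executescript("""
CREATE TABLE learned_patterns (
    id INTEGER PRIMARY KEY,
    title TEXT,
    content TEXT,
    confidence REAL DEFAULT 0.5
);
""")
```

Keeping the operational tables in one connection and the knowledge tables in another is what delivers the performance isolation: a burst of task logging can never hold a lock that a retrieval query is waiting on.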

3. How It Works: The Closed-Loop Learning Cycle (The "Ω" Algorithms)

The system operates on a continuous, four-step learning loop inspired by academic research (ReasoningBank).

Step 1: Ω_Retrieve (Consult the Library) Before starting any task, the system queries its knowledge base. It uses vector embeddings to perform a semantic search, finding relevant patterns based on the intent of the task, not just keywords. This provides the agent with proven templates and strategies, immediately priming it for a compositional "edit vs. build" approach.

  • LLM/Token Impact: This step adds a small, fixed cost upfront (one API call to embed the query), but offers massive downstream savings. By injecting a 500-token Gold Standard pattern, it can prevent the generation of 2,000+ tokens of new, unverified code.
    • Estimated Token Savings: 50-75% per task where a relevant pattern is found.
    • Verification: A/B test 100 tasks, half with retrieval and half without. Measure the total tokens consumed by the agent for each group. The difference will quantify the savings.
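The retrieval step above can be sketched as a cosine-similarity search over stored pattern embeddings. In the real system the vectors would come from an embedding API call against learning.db; the pattern names and toy vectors here are assumptions for illustration.

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy stand-in for the embeddings table in learning.db.
patterns = {
    "beginner-lesson-template": [0.9, 0.1, 0.0],
    "api-error-handling":       [0.1, 0.8, 0.3],
}

def retrieve(query_vec, top_k=1):
    # Rank stored patterns by semantic closeness to the task's intent.
    ranked = sorted(patterns.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]

# A query embedding near "write a beginner lesson" lands on the template.
print(retrieve([0.85, 0.15, 0.05]))  # ['beginner-lesson-template']
```

Because the match is on vector similarity rather than keywords, a request phrased as "teach decorators to novices" can still surface a pattern titled "beginner lesson template."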

Step 2: Ω_Execute (Perform the Work) The agent performs the task, guided by the retrieved patterns. Its every action, decision, and output is meticulously recorded as a trajectory_json in the execution_history_new table. This creates a high-fidelity "flight data recording" of the entire process.

  • LLM/Token Impact: The agent's own LLM calls are logged here. The total token count provides the baseline for measuring the effectiveness of retrieval.
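The "flight data recording" can be sketched as appending each agent action to a trajectory and persisting it as trajectory_json. The column layout and token counts below are illustrative assumptions, not the actual execution_history_new schema.

```python
import json
import sqlite3

# Sketch of Ω_Execute logging: every significant action is appended to a
# trajectory, then stored as JSON alongside the total token count.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE execution_history_new (
    task_id INTEGER, trajectory_json TEXT, total_tokens INTEGER)""")

trajectory = []

def log_step(action, tokens):
    trajectory.append({"action": action, "tokens": tokens})

log_step("Wrote introduction", 350)
log_step("Created code example", 620)

db.execute("INSERT INTO execution_history_new VALUES (?, ?, ?)",
           (1, json.dumps(trajectory), sum(s["tokens"] for s in trajectory)))

(tokens,) = db.execute(
    "SELECT total_tokens FROM execution_history_new").fetchone()
print(tokens)  # 970
```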

Step 3: Ω_Judge (Evaluate the Outcome) After execution, a specialized "Judge" agent (an LLM with a strict evaluation rubric) analyzes the execution history. It assigns a binary verdict (success or failure) and a confidence score. This provides the crucial feedback signal for learning.

  • LLM/Token Impact: This step requires one LLM call per task run. It's a necessary cost for enabling automated learning.
    • Estimated Accuracy: >80% alignment with human judgment on success/failure.
    • Verification: Have a human evaluator judge 100 task outcomes independently. Compare the human labels to the AI Judge's labels to calculate the accuracy percentage.
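The verification procedure above reduces to a simple agreement calculation. The labels below are toy data, not real evaluation results; in practice each list would hold 100 verdicts.

```python
# Compare the AI Judge's verdicts against independent human labels and
# compute the fraction on which they agree.
def judge_accuracy(judge_labels, human_labels):
    agree = sum(j == h for j, h in zip(judge_labels, human_labels))
    return agree / len(human_labels)

human = ["success", "success", "failure", "success", "failure"]
judge = ["success", "failure", "failure", "success", "failure"]

print(judge_accuracy(judge, human))  # 0.8
```

An agreement rate at or above the 80% target would justify trusting the Judge's verdicts as the learning signal; below it, the rubric needs tightening.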

Step 4: Ω_Distill & Ω_Consolidate (Learn and Organize) If a task was a high-confidence success, the Ω_Distill algorithm uses an LLM to analyze the execution history and extract a new, generalizable "Lego piece" (a learned_pattern). This new knowledge is then added to the learning.db. Periodically, the Ω_Consolidate process runs to find duplicates and contradictions, keeping the library clean.

  • LLM/Token Impact: Ω_Distill requires one LLM call per successful learning. This is the cost of creating new intellectual assets.
    • Estimated Learning Rate: 10-20 new, high-quality patterns extracted per week on an active system.
    • Verification: Run the system for one week under a typical workload and count the number of new entries in the learned_patterns table with an initial confidence > 0.6.
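The weekly verification above is a single query against learning.db. The schema and rows here are illustrative assumptions, not real distilled patterns.

```python
import sqlite3

# Count newly distilled patterns whose initial confidence clears the
# 0.6 threshold described above.
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE learned_patterns (
    title TEXT, confidence REAL, created_at TEXT)""")

rows = [
    ("Beginner's Guide to Explaining a Python Concept", 0.75, "2025-10-20"),
    ("Retry-with-backoff wrapper",                      0.55, "2025-10-21"),
    ("400-word lesson template",                        0.70, "2025-10-22"),
]
db.executemany("INSERT INTO learned_patterns VALUES (?, ?, ?)", rows)

(count,) = db.execute(
    "SELECT COUNT(*) FROM learned_patterns WHERE confidence > 0.6").fetchone()
print(count)  # 2
```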

4. Other Key Information on This System

  • Bayesian Confidence: The learned_patterns table includes a data-driven confidence score that automatically updates based on usage. A successful application of a pattern increases its confidence (+0.05), while a failure decreases it (-0.10). This ensures that only the most reliable "Lego pieces" survive and are prioritized over time.
  • Gold Standard Governance: The system includes a formal gold_standard_registry. This is not just a flag; it's a comprehensive governance layer that tracks the promotion, validation history, and impact metrics of the most elite, near-proof-level patterns.
  • Backward Compatibility: The entire system is implemented through a compatibility layer and a hooks system. This means the 34 existing agents require zero code changes to benefit from this new cognitive architecture.
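The confidence update rule described in the first bullet above can be sketched as a small clamped adjustment; the clamping to [0, 1] is an assumption, since the source states only the +0.05 and -0.10 deltas.

```python
# +0.05 on a successful application of a pattern, -0.10 on a failure,
# kept within [0, 1] so repeated outcomes cannot push confidence
# out of range.
def update_confidence(confidence, success):
    delta = 0.05 if success else -0.10
    return min(1.0, max(0.0, confidence + delta))

c = 0.60
c = update_confidence(c, True)   # one success  -> 0.65
c = update_confidence(c, False)  # one failure  -> 0.55
print(round(c, 2))  # 0.55
```

Note the asymmetry: a failure costs twice what a success earns, so an unreliable pattern decays out of rotation quickly while a reliable one climbs slowly.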

User Story Example: Creating a 400-Word Lesson

To illustrate how this system works in practice, consider the following user story. This demonstrates how the different table categories work in concert to fulfill a request and, crucially, to learn from the experience.

The Analogy: The AI is an Apprentice Chef

Think of the entire AI system as a brilliant but young apprentice chef working in a massive, ever-evolving kitchen.

  • The learning.db is the Kitchen's Grand Library of Recipes.

    • learned_patterns is the main Cookbook, filled with recipes (patterns) of varying quality.
    • gold_standard_registry is the special, locked section of the library for Michelin-Star Recipes (Gold Standards).
    • embeddings is the Magical Index Card System that lets the chef find a recipe by describing the taste of a dish, not just its name.
    • execution_history_new is the Chef's Personal Notebook, where every attempt to cook something is meticulously logged, including notes on what went right or wrong.
  • The memory.db is the Busy Prep Station for the current shift.

    • sessions and tasks are the Order Tickets coming in from the front of the house.
    • task_runs is the Live Action Log for the current dish being prepared, tracking every chop, stir, and sear in real-time.

With that in mind, let's walk through the user story.


User Story: Create a 400-Word Lesson on "Python Decorators"

Here is the step-by-step journey through the databases, showing exactly which tables are used and why.

Step 1: The Order Comes In (The User's Request)

The user asks: "Create a 400-word lesson on Python Decorators for a beginner's course."

  • Tables Used:
    • memory.db -> sessions: A new entry is created to track this entire work session.
    • memory.db -> tasks: A new entry is created for the specific task: "Create a lesson on Python Decorators."

Step 2: The Apprentice Chef Consults the Library (Ω_Retrieve)

Before starting, the system needs to find the best way to write a lesson. It doesn't want to start from scratch.

  • Tables Used:
    • learning.db -> learned_pattern_embeddings: The system takes the user's request, creates a vector embedding (the "taste profile"), and searches this table to find the ID of recipes that are semantically similar. It might find patterns related to "writing beginner lessons," "explaining complex code concepts," or even a previous lesson on a similar topic.
    • learning.db -> learned_patterns: Using the IDs from the search, the system fetches the full "recipes"—the actual content of the most relevant patterns.
    • learning.db -> gold_standard_registry: It cross-references the retrieved patterns with this table to see if any of them are "Michelin-Star" quality, giving them higher priority.

Step 3: The Chef Starts Cooking (Agent Execution)

Now, with the best recipes and templates in mind, the agent (our chef) begins to write the 400-word lesson.

  • Tables Used:
    • memory.db -> task_runs: A new entry is created. As the agent writes the lesson, every significant action (e.g., "Wrote introduction," "Created code example," "Wrote conclusion") is logged here in the trajectory_json field. This is the real-time log of the cooking process.

Step 4: The Head Chef Tastes the Dish (Ω_Judge)

The lesson is written. Now, an automated process needs to evaluate its quality.

  • Tables Used:
    • learning.db -> execution_history_new: A permanent, detailed record of the completed task_run is logged here.
    • memory.db -> outcomes: A "Judge" agent reads the final lesson from the execution history, evaluates it against a rubric (e.g., "Is it 400 words? Is it clear? Is the code correct?"), and writes a verdict (success) and a confidence score (0.95) into this table.

Step 5: A New Recipe is Born (Ω_Distill)

The Head Chef loved the dish and thinks the technique is worth saving for the future.

  • Tables Used:
    • learning.db -> learned_patterns: The "Distill" agent analyzes the successful execution log and extracts a new, generalizable pattern from it. For example, it might create a new recipe titled: "Beginner's Guide to Explaining a Python Concept." The content of the lesson becomes the template for this new pattern. This new recipe is added to the main cookbook.
    • learning.db -> learned_pattern_embeddings: An embedding (the "taste profile") is immediately created for this new recipe and stored here, so it can be found in future searches.

Step 6 (Optional - Later): The Librarians Organize the Library (Ω_Consolidate)

Weeks later, the system might notice that the new recipe is very similar to an older one.

  • Tables Used:
    • learning.db -> pattern_relationships: The "Consolidate" process runs, compares the embeddings of all recipes, and finds the two similar ones. It then creates a new entry in this table, linking them with a relationship like refines or duplicates, keeping the library clean and organized.
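The consolidation pass can be sketched as a pairwise similarity sweep over the stored embeddings, recording a relationship whenever two recipes are nearly identical. The vectors, names, and 0.98 threshold below are all illustrative assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Toy stand-in for learned_pattern_embeddings.
embeddings = {
    "lesson-template-v1": [0.90, 0.10, 0.10],
    "lesson-template-v2": [0.88, 0.12, 0.09],
    "api-error-handling": [0.10, 0.90, 0.20],
}

# Compare every pair; near-identical pairs become pattern_relationships
# rows with a 'duplicates' link.
relationships = []
names = list(embeddings)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        if cosine(embeddings[names[i]], embeddings[names[j]]) > 0.98:
            relationships.append((names[i], names[j], "duplicates"))

print(relationships)
```

Here the two lesson templates are flagged as duplicates while the unrelated error-handling pattern is left alone, which is exactly the librarian's job: merge near-copies, never distinct knowledge.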

User Story Summary: How Many Tables Were Used?

For the core workflow of a user wanting to create a single 400-word lesson, the system actively used 10 distinct tables across both databases.

  1. sessions (memory.db) - To track the overall request.
  2. tasks (memory.db) - To define the specific task.
  3. learned_pattern_embeddings (learning.db) - To find relevant knowledge.
  4. learned_patterns (learning.db) - To retrieve the knowledge and later save a new learning.
  5. gold_standard_registry (learning.db) - To prioritize the best knowledge.
  6. task_runs (memory.db) - To log the live creation process.
  7. execution_history_new (learning.db) - To create a permanent record of the work.
  8. outcomes (memory.db) - To store the quality verdict.
  9. pattern_relationships (learning.db) - (Used later) To keep the knowledge base organized.
  10. skill_usage (learning.db) - (Implicitly used) The system would also log that the "lesson creation" skill was used.

This shows that while there are 29+ tables in total, a specific set of them works together in a logical and efficient sequence to fulfill the user's request and, crucially, to learn from the experience.

In this single user story, 10 of those tables were actively used, demonstrating the deep integration of the learning and operational components. This is the flywheel in action: a request is fulfilled more efficiently thanks to past knowledge, and the successful completion of that request generates new knowledge that will make the system even more efficient for the next task.


Special thanks to Ruv and his open-source research on memory and learning systems; search GitHub for ruvnet for great resources.

AISP (AI Symbolic Protocol) is a neural-symbolic language developed by Bradley Ross. Initial, preliminary tests show that the prose-based system presented here can be improved dramatically using the AISP approach, with the primary gains in ambiguity reduction, more efficient agent communication, and significant reductions in token usage and false-positive pattern recognition.
