mirror of https://github.com/sdwolf4103/opencode-working-memory.git synced 2026-06-02 06:19:36 +02:00

Ralph Chang 32d5ddfb50 Initial commit: OpenCode Working Memory Plugin v1.0.0

- Four-tier memory architecture (Core, Working, Pruning, Pressure)
- Phase 1: Core Memory blocks (goal/progress/context)
- Phase 2: Smart Pruning with adaptive thresholds
- Phase 3: Working Memory with slots + pool decay
- Phase 4: Pressure monitoring with interventions
- Phase 4.5: Storage governance (session cleanup + cache sweep)
- Complete documentation (README, AGENTS, installation, architecture, configuration)
- MIT licensed

2026-02-18 09:49:09 +08:00

Architecture Documentation

Overview

The Working Memory Plugin implements a four-tier memory architecture designed to maximize context efficiency for AI agents in OpenCode sessions.

┌─────────────────────────────────────────────────────────────┐
│                    TIER 1: CORE MEMORY                       │
│  Persistent blocks: goal (1000) | progress (2000) | context (1500) │
│  Survives compaction, always visible in system prompt       │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                  TIER 2: WORKING MEMORY                      │
│  Session-scoped slots + memory pool                         │
│  Slots: error(3) | decision(5) | todo(3) | dependency(3)   │
│  Pool: Exponential decay (γ=0.85) + mention tracking        │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                   TIER 3: SMART PRUNING                      │
│  Filters tool outputs before adding to conversation         │
│  Removes: file lists, verbose logs, repetitive content      │
│  Modes: normal → aggressive → hyper-aggressive               │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                 TIER 4: PRESSURE MONITORING                  │
│  Tracks context usage: safe → moderate → high → critical    │
│  Thresholds: 70% | 85% | 95%                                │
│  Intervention: Sends promptAsync() with full visible prompt │
└─────────────────────────────────────────────────────────────┘

Phase 1: Core Memory Foundation

Purpose

Provide persistent memory blocks that survive conversation compaction and are always injected into the system prompt.

Storage

Location: .opencode/memory-core/<sessionID>.json

Schema:

{
  sessionID: string;
  blocks: {
    goal: { content: string; chars: number; maxChars: 1000; updatedAt: string };
    progress: { content: string; chars: number; maxChars: 2000; updatedAt: string };
    context: { content: string; chars: number; maxChars: 1500; updatedAt: string };
  };
  updatedAt: string;
}

Character Limits

goal: 1000 chars (ONE specific task)
progress: 2000 chars (done/in-progress/blocked checklist)
context: 1500 chars (current working files + key patterns)

Operations

replace: Completely replace block content
append: Add content to end (auto-adds newline)
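
The two operations above can be sketched as a small pure function. This is a minimal illustration, not the plugin's actual code: the names `updateBlock`, `emptyBlock`, and `MAX_CHARS` are hypothetical, and rejecting an over-limit update with an error (rather than truncating) is an assumption, as is skipping the newline when the block is still empty.

```typescript
// Hypothetical sketch of the replace/append operations on a core memory block.
type BlockName = "goal" | "progress" | "context";

interface Block {
  content: string;
  chars: number;
  maxChars: number;
  updatedAt: string;
}

// Limits taken from the Character Limits section.
const MAX_CHARS: Record<BlockName, number> = {
  goal: 1000,
  progress: 2000,
  context: 1500,
};

function emptyBlock(name: BlockName): Block {
  return {
    content: "",
    chars: 0,
    maxChars: MAX_CHARS[name],
    updatedAt: new Date().toISOString(),
  };
}

// Assumption: an update that would exceed maxChars throws instead of truncating.
function updateBlock(block: Block, op: "replace" | "append", text: string): Block {
  let content = text;
  if (op === "append" && block.content !== "") {
    // append joins the existing content and the new text with a newline
    content = `${block.content}\n${text}`;
  }
  if (content.length > block.maxChars) {
    throw new Error(`block would hold ${content.length} chars, limit is ${block.maxChars}`);
  }
  return { ...block, content, chars: content.length, updatedAt: new Date().toISOString() };
}
```

Returning a new object instead of mutating the block keeps the sketch easy to test; whether the plugin mutates in place is not stated in this document.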

Tools

core_memory_update: Update or append to blocks
core_memory_read: Read current state of all blocks

System Prompt Injection

Blocks are injected into the system prompt on every agent message as:

<core_memory>
<goal chars="87/1000">...</goal>
<progress chars="560/2000">...</progress>
<context chars="479/1500">...</context>
</core_memory>
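
A rendering step that produces this shape might look like the following sketch. `renderCoreMemory` is a hypothetical name, and the sketch assumes block content is inserted verbatim with no XML escaping, which this document does not specify.

```typescript
// Hypothetical renderer for the <core_memory> section of the system prompt.
interface CoreBlock {
  content: string;
  maxChars: number;
}

type CoreBlocks = Record<"goal" | "progress" | "context", CoreBlock>;

function renderCoreMemory(blocks: CoreBlocks): string {
  // Fixed order so the prompt layout is stable between messages.
  const order = ["goal", "progress", "context"] as const;
  const lines = order.map((name) => {
    const { content, maxChars } = blocks[name];
    // Assumption: content is emitted as-is, without escaping.
    return `<${name} chars="${content.length}/${maxChars}">${content}</${name}>`;
  });
  return ["<core_memory>", ...lines, "</core_memory>"].join("\n");
}
```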

Phase 2: Smart Pruning

Purpose

Reduce context bloat by filtering tool outputs before they enter the conversation history.

Pruning Modes

Normal Mode (Pressure < 85%)

Remove file/directory listings > 50 lines
Truncate verbose tool outputs
Keep first/last 30 lines of long outputs
Preserve error messages and key information

Aggressive Mode (85% ≤ Pressure < 95%)

Listing threshold drops from 50 to 30 lines
More aggressive truncation (first/last 20 lines)
Filter repetitive content

Hyper-Aggressive Mode (Pressure ≥ 95%)

Listing threshold drops to 15 lines
Keep only first/last 10 lines
Maximum compression
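
The three modes can be read as one parameter table keyed by pressure. The sketch below is an interpretation, not the plugin's code: `modeForPressure` and `truncateOutput` are hypothetical names, it assumes the line threshold applies to any long output, and it assumes "first/last N lines" means N lines at each end. With the documented numbers the head and tail can cover the whole output just above the threshold, so the sketch only elides when there is something left to omit.

```typescript
// Hypothetical mapping from pressure (percent) to pruning parameters.
interface PruneMode {
  name: "normal" | "aggressive" | "hyper-aggressive";
  thresholdLines: number; // outputs at or below this length are left alone
  keepLines: number; // lines kept at the head and again at the tail
}

function modeForPressure(pressure: number): PruneMode {
  if (pressure >= 95) return { name: "hyper-aggressive", thresholdLines: 15, keepLines: 10 };
  if (pressure >= 85) return { name: "aggressive", thresholdLines: 30, keepLines: 20 };
  return { name: "normal", thresholdLines: 50, keepLines: 30 };
}

function truncateOutput(output: string, pressure: number): string {
  const mode = modeForPressure(pressure);
  const lines = output.split("\n");
  const omitted = lines.length - 2 * mode.keepLines;
  // Nothing to do if the output is short or head and tail would overlap.
  if (lines.length <= mode.thresholdLines || omitted <= 0) return output;
  return [
    ...lines.slice(0, mode.keepLines),
    `[... ${omitted} lines omitted (${mode.name} mode) ...]`,
    ...lines.slice(-mode.keepLines),
  ].join("\n");
}
```

For a 100-line output this keeps 60 lines plus a marker in normal mode and 20 lines plus a marker in hyper-aggressive mode.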

Pruning Heuristics

File Listings: Detect ls, find, glob outputs
Directory Trees: Detect tree-like structures with /
Log Files: Detect timestamp patterns, stack traces
Repetitive Content: Detect similar consecutive lines
Synthetic Content: Preserve synthetic: true markers
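
As an illustration of one of these heuristics, the sketch below collapses runs of similar consecutive lines. It covers only the repetitive-content case and is not taken from the plugin: `signature` and `collapseRepeats` are hypothetical names, and treating two lines as "similar" when they match after digit runs are masked is an assumption, as is the minimum run length of 3.

```typescript
// Assumption: lines are "similar" if they are equal once digit runs are masked.
function signature(line: string): string {
  return line.replace(/\d+/g, "#").trim();
}

// Collapse each run of at least minRun similar lines to its first line plus a marker.
function collapseRepeats(lines: string[], minRun = 3): string[] {
  const sigs = lines.map(signature);
  const out: string[] = [];
  let i = 0;
  while (i < lines.length) {
    // Extend j to the end of the run that starts at i.
    let j = i + 1;
    while (j < lines.length && sigs[j] === sigs[i]) j++;
    const run = lines.slice(i, j);
    if (run.length >= minRun) {
      out.push(...run.slice(0, 1), `[... ${run.length - 1} similar lines omitted ...]`);
    } else {
      out.push(...run);
    }
    i = j;
  }
  return out;
}
```

Keeping the first line of each run leaves one concrete example in the output, so the agent can still see what was repeated.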

Implementation

Pruning runs in the tool.execute.after hook, before the tool output enters the conversation.

Phase 3: Working Memory

Purpose

Provide session-scoped memory with structured slots and a general-purpose pool with intelligent decay.

Storage

Location: .opencode/memory-working/<sessionID>.json

Schema:

{
  sessionID: string;
  slots: {
    error: Array<WorkingMemoryItem>;      // Max 3
    decision: Array<WorkingMemoryItem>;   // Max 5
    todo: Array<WorkingMemoryItem>;       // Max 3
    dependency: Array<WorkingMemoryItem>; // Max 3
  };
  pool: Array<WorkingMemoryItem>;
  eventCounter: number;
  updatedAt: string;
}

Slot Types

Slot	Max Items	Purpose
error	3	Recent errors that need fixing
decision	5	Important decisions made
todo	3	Current task checklist
dependency	3	File/package dependencies
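
A bounded slot can be sketched as a capped list. `SlotType` and `SLOT_CONFIG` are named in this document, but their shape here is assumed, as are the `WorkingMemoryItem` fields and the helper name `addToSlot`. The eviction policy (drop the oldest entry when a slot is full) and the duplicate check are assumptions; the document only states the maximum sizes.

```typescript
type SlotType = "error" | "decision" | "todo" | "dependency";

// Assumed minimal item shape; the real WorkingMemoryItem may carry more fields.
interface WorkingMemoryItem {
  content: string;
  eventNumber: number;
}

// Maximum items per slot, from the table above.
const SLOT_CONFIG: Record<SlotType, { maxItems: number }> = {
  error: { maxItems: 3 },
  decision: { maxItems: 5 },
  todo: { maxItems: 3 },
  dependency: { maxItems: 3 },
};

// Assumption: duplicates (by content) are ignored and the oldest item is evicted first.
function addToSlot(
  slots: Record<SlotType, WorkingMemoryItem[]>,
  type: SlotType,
  item: WorkingMemoryItem,
): void {
  const slot = slots[type];
  if (slot.some((existing) => existing.content === item.content)) return;
  slot.push(item);
  while (slot.length > SLOT_CONFIG[type].maxItems) slot.shift();
}
```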

Memory Pool

General-purpose storage with exponential decay:

score = exp(-γ * age) + mentionCount

Where:

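γ = 0.85 (decay rate; since exp(-0.85) ≈ 0.43, the recency term as written shrinks by roughly 57% per event, whereas a 15% per-event decay would correspond to 0.85^age)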
age = eventCounter - item.eventNumber
mentionCount: Number of times item mentioned in conversation

Items with score < 0.01 are pruned.
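
The scoring and pruning rule can be sketched directly from the formula as written. The names `PoolItem`, `score`, and `prunePool` are hypothetical, and sorting the survivors by descending score is an assumption based on the "sorted by relevance" note in the prompt output.

```typescript
// Assumed minimal shape of a pool item.
interface PoolItem {
  content: string;
  eventNumber: number; // eventCounter value when the item was added
  mentionCount: number;
}

const GAMMA = 0.85; // decay rate from this section
const PRUNE_BELOW = 0.01; // items scoring below this are dropped

// score = exp(-γ * age) + mentionCount
function score(item: PoolItem, eventCounter: number): number {
  const age = eventCounter - item.eventNumber;
  return Math.exp(-GAMMA * age) + item.mentionCount;
}

// Drop items below the cutoff, then order the rest by descending score.
function prunePool(pool: PoolItem[], eventCounter: number): PoolItem[] {
  return pool
    .filter((item) => score(item, eventCounter) >= PRUNE_BELOW)
    .sort((a, b) => score(b, eventCounter) - score(a, eventCounter));
}
```

Under these constants an item that is never mentioned survives through age 5 (score ≈ 0.014) and is pruned at age 6 (score ≈ 0.006), while any item with at least one mention always scores above the cutoff.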

Auto-Extraction

Working memory items are automatically extracted from:

Tool outputs (file paths, errors, dependencies)
User messages (decisions, todos)
Assistant responses (key information)

Manual Management

Tools:

working_memory_add: Manually add item
working_memory_clear: Clear all items
working_memory_clear_slot: Clear specific slot (e.g., after fixing all errors)
working_memory_remove: Remove specific item by content match

System Prompt Injection

<working_memory>
Recent session context (auto-managed, sorted by relevance):

⚠️ Errors:
  - TypeError at line 42 in utils.ts
  - Missing import in index.ts

📁 Key Files:
  - src/components/Button.tsx
  - src/utils/helpers.ts

(15 items shown, updated: 9:46:47 AM)
</working_memory>

Phase 4: Pressure Monitoring

Purpose

Track conversation context usage and trigger interventions when approaching limits.

Pressure Calculation

pressure = (visiblePromptChars / estimatedContextLimit) * 100

Where:

visiblePromptChars: Total characters in system prompt + tool outputs
estimatedContextLimit: ~180,000 chars (conservative estimate)

Pressure Levels

Level	Threshold	Behavior
safe	< 70%	Normal operation
moderate	70-84%	Warning in system prompt
high	85-94%	Aggressive pruning + warning
critical	≥ 95%	Hyper-aggressive pruning + intervention
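
The calculation and the level table combine into two small functions. This is a sketch with hypothetical function names (`pressurePercent`, `levelFor`); the thresholds and the 180,000-character limit come from this document.

```typescript
type PressureLevel = "safe" | "moderate" | "high" | "critical";

const PRESSURE_THRESHOLDS = { moderate: 70, high: 85, critical: 95 };
const ESTIMATED_CONTEXT_LIMIT = 180000; // chars, the conservative estimate above

// pressure = (visiblePromptChars / estimatedContextLimit) * 100
function pressurePercent(visiblePromptChars: number, limit: number = ESTIMATED_CONTEXT_LIMIT): number {
  return (visiblePromptChars / limit) * 100;
}

// Check the highest threshold first so each level gets its own band.
function levelFor(percentage: number): PressureLevel {
  if (percentage >= PRESSURE_THRESHOLDS.critical) return "critical";
  if (percentage >= PRESSURE_THRESHOLDS.high) return "high";
  if (percentage >= PRESSURE_THRESHOLDS.moderate) return "moderate";
  return "safe";
}
```

For example, 156,600 visible characters against the 180,000 limit gives 87%, which falls in the high band.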

Pressure Storage

Location: .opencode/memory-working/<sessionID>_pressure.json

Schema:

{
  sessionID: string;
  level: "safe" | "moderate" | "high" | "critical";
  percentage: number;
  visiblePromptChars: number;
  estimatedLimit: 180000;
  lastChecked: string;
  interventionsSent: number;
}

Intervention Mechanism

When pressure reaches critical (≥95%):

Plugin sends promptAsync() message to agent
Message includes full visible prompt for review
Agent can compress core memory, clear working memory, or continue
Intervention tracked in interventionsSent counter

System Prompt Injection

[Memory Pressure: 87% (high) - 156,600/180,000 chars]
⚠️ High memory pressure detected. Consider:
- Compressing core_memory blocks (use core_memory_update)
- Clearing resolved errors (use working_memory_clear_slot)
- Removing old pool items (auto-pruned at score < 0.01)

Phase 4.5: Storage Governance

Purpose

Prevent .opencode/ directory bloat from accumulating tool output caches and orphaned memory files.

Layer 1: Session Deletion Cleanup

Trigger: experimental.session.deleted hook

Actions:

Remove .opencode/memory-core/<sessionID>.json
Remove .opencode/memory-working/<sessionID>.json
Remove .opencode/memory-working/<sessionID>_pressure.json
Remove .opencode/memory-working/<sessionID>_compaction.json

Layer 2: Tool Output Cache Sweep

Trigger: Every 500 events (eventCounter % 500 === 0)

Target: .opencode/cache/tool-outputs/ directory

Policy:

Keep most recent 300 files (sorted by mtime)
Delete files older than 7 days (TTL policy)
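
The trigger and policy can be sketched as pure functions over file metadata, which leaves the actual file system calls out. `CacheFile`, `shouldSweep`, and `selectForDeletion` are hypothetical names, and the sketch assumes a file is deleted if it breaks either rule (beyond the newest 300, or older than 7 days).

```typescript
// Assumed metadata for one cached tool output.
interface CacheFile {
  path: string;
  mtimeMs: number; // last modification time in milliseconds since the epoch
}

const MAX_FILES = 300;
const TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// Trigger from this section: sweep whenever eventCounter is a multiple of 500.
function shouldSweep(eventCounter: number): boolean {
  return eventCounter % 500 === 0;
}

// Assumption: a file is selected if it falls outside the newest maxFiles OR exceeds the TTL.
function selectForDeletion(
  files: CacheFile[],
  nowMs: number,
  maxFiles: number = MAX_FILES,
  ttlMs: number = TTL_MS,
): string[] {
  const newestFirst = [...files].sort((a, b) => b.mtimeMs - a.mtimeMs);
  return newestFirst
    .filter((file, index) => index >= maxFiles || nowMs - file.mtimeMs > ttlMs)
    .map((file) => file.path);
}
```

Separating selection from deletion makes the policy testable without touching the disk; the caller would then remove the returned paths and record the counts in the sweep log.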

Logging: Write sweep results to .opencode/memory-working/<sessionID>_sweep.json

{
  sessionID: string;
  timestamp: string;
  eventCounter: number;
  results: {
    filesScanned: number;
    filesDeleted: number;
    bytesReclaimed: number;
    errors: Array<string>;
  };
}

Performance Considerations

Memory Budgets

Core Memory: 4,500 chars (injected every message)
Working Memory: ~1,600 chars (injected every message)
Total Overhead: ~6,100 chars per message

Compaction Behavior

When OpenCode compacts conversation (clears old messages):

Core memory: Preserved (persistent across compactions)
Working memory: Preserved (session-scoped, cleared on session end)
Pressure state: Preserved (tracks across compaction)
Compaction log: Saved to <sessionID>_compaction.json

Storage Footprint

Each session: up to 5 JSON files (core, working, pressure, compaction, sweep log; ~5-20 KB total)
Tool output cache: Max 300 files (~10-50 MB depending on outputs)
Sweep every 500 events keeps storage bounded

Extension Points

Custom Slot Types

To add new slot types:

Update SlotType union in types
Add to SLOT_CONFIG with max items
Update formatWorkingMemoryForPrompt() for display
Update extraction heuristics in tool.execute.after

Custom Pruning Rules

To add pruning heuristics:

Update shouldPrune() with new detection logic
Add to pruneToolOutput() with filtering rules
Test with representative tool outputs

Custom Pressure Thresholds

Adjust in constants:

const PRESSURE_THRESHOLDS = {
  moderate: 70,
  high: 85,
  critical: 95,
};

Migration & Compatibility

Old Format → New Format

The plugin automatically migrates files from the old format:

// Old format (pre-Phase 3)
{ items: Array<Item> }

// New format (Phase 3+)
{ slots: Record<SlotType, Array<Item>>, pool: Array<Item> }

Migration happens on first load of old format files.
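
A load-time migration along these lines could be sketched as follows. The function names `isOldFormat` and `migrate` are hypothetical, and placing every legacy item in the pool is an assumption: this document does not say how old items are distributed between slots and pool.

```typescript
type SlotType = "error" | "decision" | "todo" | "dependency";

// Assumed minimal item shape for illustration.
interface Item {
  content: string;
}

interface OldFormat {
  items: Item[];
}

interface NewFormat {
  slots: Record<SlotType, Item[]>;
  pool: Item[];
}

// Detect the pre-Phase 3 layout by its top-level items array.
function isOldFormat(data: unknown): data is OldFormat {
  return (
    typeof data === "object" &&
    data !== null &&
    Array.isArray((data as { items?: unknown }).items)
  );
}

// Assumption: legacy items carry no slot information, so they all go to the pool.
function migrate(data: OldFormat | NewFormat): NewFormat {
  if (isOldFormat(data)) {
    return {
      slots: { error: [], decision: [], todo: [], dependency: [] },
      pool: [...data.items],
    };
  }
  return data; // already in the new format
}
```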

File System Layout

.opencode/
├── memory-core/
│   └── <sessionID>.json          # Core memory blocks
├── memory-working/
│   ├── <sessionID>.json          # Working memory (slots + pool)
│   ├── <sessionID>_pressure.json # Pressure monitoring state
│   ├── <sessionID>_compaction.json # Compaction event log
│   └── <sessionID>_sweep.json    # Storage sweep log
└── cache/
    └── tool-outputs/
        └── *.json                # Tool output cache (auto-swept)

Security Considerations

All files written with 0644 permissions (owner read/write, group/others read)
Directories created with 0755 permissions (owner rwx, group/others rx)
No sensitive data should be stored in memory blocks (user responsibility)
Session IDs are opaque identifiers, not derived from sensitive data

Last Updated: February 2026
Implementation: index.ts (1700+ lines)
