mirror of
https://github.com/sdwolf4103/opencode-working-memory.git
synced 2026-06-02 06:19:36 +02:00
Compare commits
13 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| 6603fe869d | |||
| 3d44269228 | |||
| a154139b27 | |||
| 7527765207 | |||
| f9acfd6136 | |||
| ca71c20a8f | |||
| 5e9ada6859 | |||
| 721544e7a8 | |||
| 32fa2bd454 | |||
| af539a42f3 | |||
| eff0d3784c | |||
| 2354b62350 | |||
| 92e90124de |
@@ -1,5 +1,51 @@
# Release Notes

## 1.2.1 (2026-04-26)

### Compaction Memory Quality — Four-Layer Defense

This release addresses systemic quality issues in workspace memory: duplicates, stale entries, and silently lost memory candidates. A four-layer defense is now in place:

```
Prompt    → Durable-content guidance keeps LLM on factual memories
Parser    → Accepts bracketless format, filters session snapshots
Storage   → Entity-key dedup + topic supersession + source priority
Staleness → Age-based pruning of obsolete compaction/manual entries
```

### Key Features

- **Self-cleaning memory**: Entity-key deduplication, topic supersession, and age-based staleness pruning automatically maintain memory quality
- **Robust parser**: Accepts both bracketless (`- type text`) and bracketed (`- [type] text`) formats — no more silently lost memories
- **Durable-content prompt**: Compaction template now guides LLM toward factual, long-lived memories while explicitly discouraging session ephemera
- **Smart snapshot filtering**: Automatically rejects project-type snapshots (file counts, test counts, Phase progress) that don't belong in long-term memory

### Fixed

- **Bracketless format bug**: Parser regex only matched `- [type]` pattern; real LLM output often uses `- type` (no brackets). Both formats now accepted. (P0a)
- **Purple/italic text in OpenCode UI**: Replaced XML/HTML comment templates with clean Markdown headings. Further hardened with negative instructions to forbid YAML frontmatter. (P0b β)
- **Session snapshots polluting memory**: Project entries like "37 個文件", "26 tests pass", "Phase 2 completed" now rejected by parser filter. (P0c)
- **Duplicate entries**: Entities deduped by key (e.g., `opencode-agenthub plugin system`). Topic conflicts resolved via supersession: newer shorter facts beat older verbose ones for decisions/feedback. (P0d)
- **Stale entries never cleaned**: Compaction/manual entries with `staleAfterDays` now auto-pruned after 30-day grace period.
- **Short reference entries rejected**: Admin PIN (`456123`) and config values (`Scrypt n=32768`) now allowed through config value allowlist despite being under 20 chars.
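
The bracketless-format fix can be pictured with a small sketch. The regex and the `parseCandidateLine` helper below are illustrative assumptions, not the plugin's actual parser; they only show one way a single pattern could accept both `- [type] text` and `- type text`.

```ts
type LongTermType = "feedback" | "project" | "decision" | "reference";

const TYPES = "feedback|project|decision|reference";

// Accepts "- [type] text" and "- type text". The optional colon after a bare
// label is an extra leniency assumed for this sketch only.
const CANDIDATE_LINE = new RegExp(`^-\\s*(?:\\[(${TYPES})\\]|(${TYPES})\\b:?)\\s+(.+)$`, "i");

function parseCandidateLine(line: string): { type: LongTermType; text: string } | null {
  const match = line.trim().match(CANDIDATE_LINE);
  if (!match) return null;
  // Exactly one of the two type groups is set, depending on the format used.
  const type = (match[1] ?? match[2]).toLowerCase() as LongTermType;
  return { type, text: match[3].trim() };
}
```

The word boundary after the bare label keeps ordinary sentences such as `- decisions are hard` from being read as a `decision` entry.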

### Changed

- **`chooseBetterMemory`**: Now accepts `"entity"` mode (length preferred, for project/reference) and `"supersession"` mode (freshness preferred, for decision/feedback).
- **Source priority in sort**: Manual/source priority now included as secondary sort tie-breaker after entry priority.
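
As a rough illustration of the two modes, here is a minimal sketch. The signature, the `Candidate` shape, and the tie-breaking rule are assumptions made for this example; in particular it reads "length preferred" as "the longer text wins", which the notes do not spell out.

```ts
type Mode = "entity" | "supersession";
type Candidate = { text: string; updatedAt: string };

// Hypothetical signature: returns whichever of the two candidates should survive.
function chooseBetterMemory<T extends Candidate>(a: T, b: T, mode: Mode): T {
  const newer = Date.parse(b.updatedAt) > Date.parse(a.updatedAt) ? b : a;
  // Supersession: the fresher statement replaces the older one.
  if (mode === "supersession") return newer;
  // Entity: assume the longer description carries more detail; ties go to the newer one.
  if (a.text.length !== b.text.length) return a.text.length > b.text.length ? a : b;
  return newer;
}
```

Under these assumptions an old, verbose decision loses to a newer, shorter one in `"supersession"` mode, while the same pair resolves to the verbose text in `"entity"` mode.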

### Technical Details

- **Parser formats**: 4 accepted (plain text label primary, plus Markdown section, legacy section, legacy XML)
- **Chinese counter words**: Regex matches `個`/`个` between numbers and nouns (e.g., `37 個文件`)
- **Entity keys cautious**: Only known product keys extracted (`opencode-agenthub`); generic config references fall back to canonical text dedup
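
A snapshot filter along the lines described above might look like the following sketch. The patterns are guesses modelled on the three examples quoted in these notes ("37 個文件", "26 tests pass", "Phase 2 completed"); the names `SNAPSHOT_PATTERNS` and `looksLikeSessionSnapshot` are hypothetical.

```ts
// Each pattern targets one kind of session snapshot named in the release notes.
const SNAPSHOT_PATTERNS: RegExp[] = [
  /\d+\s*(?:個|个)\s*\S+/, // number + Chinese counter word + noun, e.g. "37 個文件"
  /\b\d+\s+(?:tests?|files?)\b/i, // counts such as "26 tests pass"
  /\bphase\s+\d+\s+(?:completed|done|finished)\b/i, // progress such as "Phase 2 completed"
];

// Returns true when the text reads like a point-in-time snapshot rather than a durable fact.
function looksLikeSessionSnapshot(text: string): boolean {
  return SNAPSHOT_PATTERNS.some(pattern => pattern.test(text));
}
```

None of the patterns use the `g` flag, so `test()` keeps no state between calls and the filter can be applied to any number of entries in sequence.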

### Tests

- **70/70 tests pass** (24 workspace-memory, 34 extractors, 12 plugin)

---

## 1.2.0 (2026-04-26)

### Memory V2 Architecture

@@ -0,0 +1,976 @@
# Memory V2 Redesign Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Replace the current heavy four-tier memory plugin with a low-token, no-extra-agent-call memory system that provides workspace-scoped long-term memory and session hot state.

**Architecture:** Implement three layers: stable workspace memory, hot session state, and native OpenCode state integration. Workspace memory is frozen per session and refreshed at compaction boundaries; hot session state tracks active files and unresolved blocking errors automatically from tool events; OpenCode todos remain owned by OpenCode and are only read during compaction.

**Tech Stack:** TypeScript, OpenCode Plugin hooks, Node/Bun file APIs, JSON sidecar storage under user data directory, TypeScript typecheck via `npm run typecheck`.

---

## Design Summary

### What changes

- Remove default agent-visible memory tools from the normal flow.
- Remove raw tool-output cache and pressure-monitor intervention from the core path.
- Add workspace-scoped long-term memory that persists across sessions but does not cross workspaces.
- Add hot session state that is fully automatic and tiny: active files, open blocking errors, and recent decisions for compaction only.
- Reuse OpenCode compaction to extract long-term memory candidates with no extra LLM call.
- Read OpenCode todos during compaction instead of duplicating todo storage.

### What stays out of memory

- Long-term memory does **not** save file lists, stack traces, code signatures, API docs, git history, architecture snapshots, or temporary task progress.
- Short-term memory does **not** save todos or dependency facts because OpenCode and project files already own those.

---

## File Structure

The current project has a single `index.ts`. This plan splits memory behavior into focused modules while keeping `index.ts` as the plugin entrypoint.

### Create

- `src/paths.ts` — computes workspace-scoped storage paths under user data directory.
- `src/storage.ts` — atomic JSON read/write helpers with safe defaults.
- `src/types.ts` — canonical schemas and constants for long-term memory and session state.
- `src/workspace-memory.ts` — load/save/merge/render long-term workspace memory.
- `src/session-state.ts` — load/save/update/render active files, open errors, recent decisions.
- `src/extractors.ts` — deterministic extraction from user messages, tool args, bash output, and compaction summaries.
- `src/opencode.ts` — thin wrappers around OpenCode SDK calls for latest user messages, summaries, and todos.
- `src/plugin.ts` — hook orchestration.
- `tests/extractors.test.ts` — unit tests for deterministic extraction.
- `tests/workspace-memory.test.ts` — unit tests for merge, dedupe, limits, staleness rendering.
- `tests/session-state.test.ts` — unit tests for active files and error lifecycle.

### Modify

- `index.ts` — replace monolithic implementation with `export { default } from "./src/plugin";`.
- `package.json` — add a test script using Node’s built-in test runner or Bun test, depending on the available runtime.
- `README.md` — update feature description from four-tier memory to Memory V2.
- `docs/architecture.md` — replace stale four-tier docs with three-layer design.
- `docs/configuration.md` — document limits and optional debug tools.
- `AGENTS.md` — update development guide, storage paths, and testing commands.

---

## Wave 1 — Storage, Types, and Deterministic Core

### Task 1: Add canonical types and limits

**Files:**
- Create: `src/types.ts`

- [ ] **Step 1: Create memory and session schemas**

Add this file:

```ts
export type LongTermType = "feedback" | "project" | "decision" | "reference";

export type LongTermSource = "explicit" | "compaction" | "manual";

export type LongTermMemoryEntry = {
  id: string;
  type: LongTermType;
  text: string;
  rationale?: string;
  source: LongTermSource;
  confidence: number;
  status: "active" | "superseded";
  createdAt: string;
  updatedAt: string;
  staleAfterDays?: number;
  supersedes?: string[];
  tags?: string[];
};

export type WorkspaceMemoryStore = {
  version: 1;
  workspace: {
    root: string;
    key: string;
  };
  limits: {
    maxRenderedChars: number;
    maxEntries: number;
  };
  entries: LongTermMemoryEntry[];
  updatedAt: string;
};

export type ActiveFile = {
  path: string;
  action: "read" | "grep" | "edit" | "write";
  count: number;
  lastSeen: number;
};

export type OpenError = {
  id: string;
  category: "typecheck" | "test" | "lint" | "build" | "runtime" | "tool";
  summary: string;
  command?: string;
  file?: string;
  fingerprint: string;
  status: "open" | "maybe_fixed";
  firstSeen: number;
  lastSeen: number;
  seenCount: number;
};

export type SessionDecision = {
  id: string;
  text: string;
  rationale?: string;
  source: "assistant" | "user" | "compaction";
  createdAt: number;
  promotedToLongTerm?: boolean;
};

export type SessionState = {
  version: 1;
  sessionID: string;
  turn: number;
  updatedAt: string;
  activeFiles: ActiveFile[];
  openErrors: OpenError[];
  recentDecisions: SessionDecision[];
};

export const LONG_TERM_LIMITS = {
  maxRenderedChars: 5200,
  targetRenderedChars: 4200,
  maxEntries: 28,
  maxEntryTextChars: 260,
  maxRationaleChars: 180,
} as const;

export const HOT_STATE_LIMITS = {
  maxRenderedChars: 1200,
  maxActiveFilesStored: 20,
  maxActiveFilesRendered: 8,
  maxOpenErrorsStored: 5,
  maxOpenErrorsRendered: 3,
  maxRecentDecisionsStored: 8,
} as const;
```

- [ ] **Step 2: Run typecheck**

Run: `npm run typecheck`

Expected: PASS or existing unrelated failures only. Since the file is not imported yet, it should not introduce errors.

---

### Task 2: Add workspace-scoped paths and atomic storage

**Files:**
- Create: `src/paths.ts`
- Create: `src/storage.ts`

- [ ] **Step 1: Create `src/paths.ts`**

```ts
import { createHash } from "crypto";
import { homedir } from "os";
import { join } from "path";
import { realpath } from "fs/promises";

export function dataHome(): string {
  return process.env.XDG_DATA_HOME ?? join(homedir(), ".local", "share");
}

export async function workspaceKey(root: string): Promise<string> {
  const resolved = await realpath(root).catch(() => root);
  return createHash("sha256").update(resolved).digest("hex").slice(0, 16);
}

export async function memoryRoot(root: string): Promise<string> {
  return join(dataHome(), "opencode-working-memory", "workspaces", await workspaceKey(root));
}

export async function workspaceMemoryPath(root: string): Promise<string> {
  return join(await memoryRoot(root), "workspace-memory.json");
}

export async function sessionStatePath(root: string, sessionID: string): Promise<string> {
  return join(await memoryRoot(root), "sessions", `${sessionID}.json`);
}
```
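
To make the resulting layout concrete, the sketch below derives the storage path for one workspace. It reuses the hashing scheme of `workspaceKey` but skips the `realpath` step so it stays synchronous; the `keyFor` helper and the `/home/alice/...` paths are made up for illustration.

```ts
import { createHash } from "node:crypto";
import { join } from "node:path";

// Same hashing scheme as workspaceKey(), minus realpath(), so the example
// does not touch the filesystem. The input is assumed to be already resolved.
function keyFor(resolvedRoot: string): string {
  return createHash("sha256").update(resolvedRoot).digest("hex").slice(0, 16);
}

// Hypothetical data directory and workspace root.
const dataDir = join("/home/alice/.local/share", "opencode-working-memory", "workspaces");
const memoryFile = join(dataDir, keyFor("/home/alice/projects/demo"), "workspace-memory.json");

console.log(memoryFile);
```

Because the key is a pure function of the resolved root, two sessions opened in the same directory share one `workspace-memory.json`, while a different directory gets its own folder.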

- [ ] **Step 2: Create `src/storage.ts`**

```ts
import { existsSync } from "fs";
import { mkdir, readFile, rename, writeFile } from "fs/promises";
import { dirname } from "path";

export async function readJSON<T>(path: string, fallback: () => T): Promise<T> {
  if (!existsSync(path)) return fallback();
  try {
    return JSON.parse(await readFile(path, "utf8")) as T;
  } catch {
    return fallback();
  }
}

export async function atomicWriteJSON(path: string, data: unknown): Promise<void> {
  await mkdir(dirname(path), { recursive: true });
  const tmp = `${path}.${process.pid}.${Date.now()}.tmp`;
  await writeFile(tmp, JSON.stringify(data, null, 2), { encoding: "utf8", mode: 0o600 });
  await rename(tmp, path);
}
```
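
Note that `readJSON` casts the parsed value to `T` without inspecting it, so a hand-edited or outdated file would be accepted as-is. If that matters, a caller could add a shape check before trusting the result. The guard below is a sketch only: `isWorkspaceMemoryStore` and `StoreShape` are hypothetical names, and the check covers just a few fields.

```ts
type StoreShape = {
  version: 1;
  workspace: { root: string; key: string };
  entries: unknown[];
};

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical guard: verifies only the fields a loader would depend on.
function isWorkspaceMemoryStore(value: unknown): value is StoreShape {
  if (!isRecord(value) || value.version !== 1) return false;
  if (!isRecord(value.workspace)) return false;
  return typeof value.workspace.root === "string"
    && typeof value.workspace.key === "string"
    && Array.isArray(value.entries);
}
```

A loader could then fall back to an empty store whenever the guard returns `false`, in the same way `readJSON` already falls back on a parse error.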

- [ ] **Step 3: Run typecheck**

Run: `npm run typecheck`

Expected: PASS.

---

### Task 3: Add extractor tests before implementation

**Files:**
- Create: `tests/extractors.test.ts`
- Modify: `package.json`

- [ ] **Step 1: Add test script**

Modify `package.json` scripts:

```json
{
  "scripts": {
    "build": "node -e \"console.log('No build step required: OpenCode loads index.ts directly')\"",
    "typecheck": "tsc --noEmit",
    "test": "node --test --experimental-strip-types tests/*.test.ts"
  }
}
```

- [ ] **Step 2: Write failing tests**

Create `tests/extractors.test.ts`:

```ts
import test from "node:test";
import assert from "node:assert/strict";
import {
  extractExplicitMemories,
  extractActiveFiles,
  extractErrorsFromBash,
  parseWorkspaceMemoryCandidates,
} from "../src/extractors.ts";

test("extractExplicitMemories captures clear remember instruction", () => {
  // "Please remember: this workspace's memory feature must be invisible by default"
  const items = extractExplicitMemories("请记住:这个 workspace 的 memory 功能必须默认无感");
  assert.equal(items.length, 1);
  // The body mentions "workspace", which classifyExplicitMemory maps to "project".
  assert.equal(items[0].type, "project");
  assert.match(items[0].text, /默认无感/);
});

test("extractExplicitMemories avoids casual negative commands", () => {
  // "Don't eat this" and "Let's talk about it later" are not memory instructions.
  assert.equal(extractExplicitMemories("不要吃这个").length, 0);
  assert.equal(extractExplicitMemories("以后再说").length, 0);
});

test("extractActiveFiles uses tool args before output", () => {
  assert.deepEqual(extractActiveFiles("read", { filePath: "/repo/index.ts" }, "random content"), [
    { path: "/repo/index.ts", action: "read" },
  ]);
});

test("extractErrorsFromBash captures typecheck failure", () => {
  const errors = extractErrorsFromBash("npm run typecheck", "src/index.ts(10,3): error TS2345: bad type");
  assert.equal(errors.length, 1);
  assert.equal(errors[0].category, "typecheck");
  assert.match(errors[0].summary, /TS2345/);
});

test("parseWorkspaceMemoryCandidates parses compaction block", () => {
  const entries = parseWorkspaceMemoryCandidates(`summary
<workspace_memory_candidates>
- [decision] Use JSON as canonical storage because it is easier to validate.
- [reference] External design notes are in Notion.
</workspace_memory_candidates>`);
  assert.equal(entries.length, 2);
  assert.equal(entries[0].type, "decision");
  assert.equal(entries[1].type, "reference");
});
```

- [ ] **Step 3: Run tests and confirm failure**

Run: `npm test`

Expected: FAIL because `src/extractors.ts` does not exist.

---

### Task 4: Implement deterministic extractors

**Files:**
- Create: `src/extractors.ts`

- [ ] **Step 1: Add extractor implementation**

```ts
import { createHash } from "crypto";
import type { ActiveFile, LongTermMemoryEntry, LongTermType, OpenError } from "./types";
import { LONG_TERM_LIMITS } from "./types";

function id(prefix: string): string {
  return `${prefix}_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`;
}

function hash(value: string): string {
  return createHash("sha1").update(value).digest("hex").slice(0, 12);
}

export function extractExplicitMemories(text: string): LongTermMemoryEntry[] {
  // Trigger phrases in Simplified Chinese, Traditional Chinese, and English,
  // followed by an optional ASCII or full-width colon.
  const patterns = [
    /(?:请记住|記住|记住这一点|remember this|commit to memory)[:：]?\s*(.+)$/im,
    /(?:从现在开始|從現在開始|从今以后|從今以後|from now on|always)[:：]?\s*(.+)$/im,
  ];

  const now = new Date().toISOString();
  const entries: LongTermMemoryEntry[] = [];

  for (const pattern of patterns) {
    const match = text.match(pattern);
    const body = match?.[1]?.trim();
    if (!body || body.length < 8) continue;
    if (/^(再说|再說|later|next time)$/i.test(body)) continue;

    const type = classifyExplicitMemory(body);
    entries.push({
      id: id("mem"),
      type,
      text: body.slice(0, LONG_TERM_LIMITS.maxEntryTextChars),
      source: "explicit",
      confidence: 1,
      status: "active",
      createdAt: now,
      updatedAt: now,
      staleAfterDays: staleAfterDaysFor(type),
    });
  }

  return entries;
}

function classifyExplicitMemory(text: string): LongTermType {
  const lower = text.toLowerCase();
  if (/https?:\/\/|linear|slack|notion|dashboard|grafana/.test(lower)) return "reference";
  if (/decide|decision|choose|chosen|决定|決定|选择|選擇/.test(lower)) return "decision";
  if (/project|workspace|repo|项目|專案/.test(lower)) return "project";
  return "feedback";
}

export function staleAfterDaysFor(type: LongTermType): number | undefined {
  if (type === "feedback") return undefined;
  if (type === "decision") return 45;
  if (type === "project") return 60;
  return 90;
}

export function extractActiveFiles(
  toolName: string,
  args: Record<string, unknown>,
  output: string,
): Array<{ path: string; action: ActiveFile["action"] }> {
  if (toolName === "read" && typeof args.filePath === "string") return [{ path: args.filePath, action: "read" }];
  if (toolName === "edit" && typeof args.filePath === "string") return [{ path: args.filePath, action: "edit" }];
  if (toolName === "write" && typeof args.filePath === "string") return [{ path: args.filePath, action: "write" }];
  if (toolName === "grep") return extractGrepPaths(output).map(path => ({ path, action: "grep" as const }));
  return [];
}

function extractGrepPaths(output: string): string[] {
  // Absolute paths at the start of a line, up to the first colon ("/path/file.ts:").
  const matches = output.match(/^(\/[^:\n]+):/gm) ?? [];
  return [...new Set(matches.map(match => match.replace(/:$/, "")))].slice(0, 10);
}

export function extractErrorsFromBash(command: string, output: string): OpenError[] {
  const lines = output.split("\n").filter(line => /error|failed|failure|exception|TS\d{4}|ERR!/i.test(line)).slice(0, 5);
  if (lines.length === 0) return [];

  const category = classifyCommand(command) ?? "runtime";
  const summary = lines.join(" ").slice(0, 280);
  const fingerprint = hash(`${category}:${summary.toLowerCase().replace(/\s+/g, " ")}`);
  const now = Date.now();

  return [{
    id: `err_${fingerprint}`,
    category,
    summary,
    command,
    file: extractFirstPath(summary),
    fingerprint,
    status: "open",
    firstSeen: now,
    lastSeen: now,
    seenCount: 1,
  }];
}

export function classifyCommand(command: string): OpenError["category"] | null {
  const c = command.toLowerCase();
  if (/\b(tsc|typecheck)\b/.test(c)) return "typecheck";
  if (/\b(test|vitest|jest|mocha|pytest|go test|cargo test)\b/.test(c)) return "test";
  if (/\b(lint|eslint|biome)\b/.test(c)) return "lint";
  if (/\b(build|vite build|webpack|tsup)\b/.test(c)) return "build";
  return null;
}

function extractFirstPath(text: string): string | undefined {
  return text.match(/[\w./-]+\.(ts|tsx|js|jsx|json|md|py|go|rs)/)?.[0];
}

export function parseWorkspaceMemoryCandidates(summary: string): LongTermMemoryEntry[] {
  const match = summary.match(/<workspace_memory_candidates>([\s\S]*?)<\/workspace_memory_candidates>/i);
  if (!match) return [];

  const now = new Date().toISOString();
  const entries: LongTermMemoryEntry[] = [];

  for (const line of match[1].split("\n")) {
    const item = line.trim().match(/^-\s*\[(feedback|project|decision|reference)\]\s*(.+)$/i);
    if (!item) continue;
    const type = item[1].toLowerCase() as LongTermType;
    const body = item[2].trim();
    if (body.length < 12) continue;
    entries.push({
      id: id("mem"),
      type,
      text: body.slice(0, LONG_TERM_LIMITS.maxEntryTextChars),
      source: "compaction",
      confidence: 0.75,
      status: "active",
      createdAt: now,
      updatedAt: now,
      staleAfterDays: staleAfterDaysFor(type),
    });
  }

  return entries;
}
```
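
The fingerprint in `extractErrorsFromBash` is what lets a repeated failure map to the same `err_…` id. The standalone sketch below repeats that recipe (the `fingerprint` helper name is introduced here for illustration) and shows that differences in case and spacing do not change the result.

```ts
import { createHash } from "node:crypto";

// Mirrors the recipe used in extractErrorsFromBash: category plus the
// lower-cased, whitespace-collapsed summary, hashed with SHA-1 and truncated.
function fingerprint(category: string, summary: string): string {
  const normalized = summary.toLowerCase().replace(/\s+/g, " ");
  return createHash("sha1").update(`${category}:${normalized}`).digest("hex").slice(0, 12);
}

const first = fingerprint("typecheck", "src/index.ts(10,3): error TS2345: bad type");
const again = fingerprint("typecheck", "src/index.ts(10,3):  ERROR TS2345:\tbad type");

console.log(first === again); // true: case and spacing differences are ignored
```

The category is part of the hashed string, so the same message reported by a test run and by a typecheck run yields two distinct fingerprints.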

- [ ] **Step 2: Run extractor tests**

Run: `npm test`

Expected: PASS for extractor tests.

---

### Wave 1 verification checkpoint

- [ ] **Step 1: Run all checks**

Run: `npm test && npm run typecheck`

Expected: PASS.

- [ ] **Step 2: Review wave output**

Confirm: Types, paths, storage helpers, and deterministic extractors exist, and tests cover clear remember instructions, false positives, active files, bash errors, and compaction candidates.

- [ ] **Step 3: Commit wave**

```bash
git add package.json src tests
git commit -m "refactor: add memory v2 core primitives"
```

---

## Wave 2 — Workspace Memory and Hot Session State

### Task 5: Implement workspace memory store

**Files:**
- Create: `src/workspace-memory.ts`
- Test: `tests/workspace-memory.test.ts`

- [ ] **Step 1: Write failing tests**

Create `tests/workspace-memory.test.ts`:

```ts
import test from "node:test";
import assert from "node:assert/strict";
import type { LongTermMemoryEntry } from "../src/types.ts";
import { enforceLongTermLimits, renderWorkspaceMemory } from "../src/workspace-memory.ts";

function entry(text: string, type: LongTermMemoryEntry["type"] = "feedback"): LongTermMemoryEntry {
  const now = new Date().toISOString();
  return { id: text, type, text, source: "explicit", confidence: 1, status: "active", createdAt: now, updatedAt: now };
}

test("enforceLongTermLimits dedupes entries", () => {
  const kept = enforceLongTermLimits([entry("Memory must be invisible"), entry("Memory must be invisible")]);
  assert.equal(kept.length, 1);
});

test("renderWorkspaceMemory includes verify marker for stale decisions", () => {
  const old = entry("Use JSON storage", "decision");
  old.createdAt = "2020-01-01T00:00:00.000Z";
  old.staleAfterDays = 45;
  const rendered = renderWorkspaceMemory({ version: 1, workspace: { root: "/repo", key: "abc" }, limits: { maxRenderedChars: 5200, maxEntries: 28 }, entries: [old], updatedAt: old.createdAt });
  // Match the per-entry stale marker, not the word "verify" in the block header.
  assert.match(rendered, /\d+d old, verify/);
});
```

- [ ] **Step 2: Implement workspace memory functions**

Create `src/workspace-memory.ts` with:

```ts
import type { LongTermMemoryEntry, WorkspaceMemoryStore } from "./types";
import { LONG_TERM_LIMITS } from "./types";
import { workspaceKey, workspaceMemoryPath } from "./paths";
import { atomicWriteJSON, readJSON } from "./storage";

export async function emptyWorkspaceMemory(root: string): Promise<WorkspaceMemoryStore> {
  return {
    version: 1,
    workspace: { root, key: await workspaceKey(root) },
    limits: { maxRenderedChars: LONG_TERM_LIMITS.maxRenderedChars, maxEntries: LONG_TERM_LIMITS.maxEntries },
    entries: [],
    updatedAt: new Date().toISOString(),
  };
}

export async function loadWorkspaceMemory(root: string): Promise<WorkspaceMemoryStore> {
  return readJSON(await workspaceMemoryPath(root), () => ({
    version: 1,
    workspace: { root, key: "unknown" },
    limits: { maxRenderedChars: LONG_TERM_LIMITS.maxRenderedChars, maxEntries: LONG_TERM_LIMITS.maxEntries },
    entries: [],
    updatedAt: new Date().toISOString(),
  }));
}

export async function saveWorkspaceMemory(root: string, store: WorkspaceMemoryStore): Promise<void> {
  store.workspace = { root, key: await workspaceKey(root) };
  store.entries = enforceLongTermLimits(store.entries);
  store.updatedAt = new Date().toISOString();
  await atomicWriteJSON(await workspaceMemoryPath(root), store);
}

export function enforceLongTermLimits(entries: LongTermMemoryEntry[]): LongTermMemoryEntry[] {
  const byKey = new Map<string, LongTermMemoryEntry>();
  for (const entry of entries.filter(e => e.status === "active")) {
    const text = entry.text.slice(0, LONG_TERM_LIMITS.maxEntryTextChars);
    const key = `${entry.type}:${text.toLowerCase().replace(/\s+/g, " ").trim()}`;
    const existing = byKey.get(key);
    if (!existing || entry.source === "explicit") byKey.set(key, { ...entry, text });
  }
  return [...byKey.values()]
    .sort((a, b) => priority(b) - priority(a))
    .slice(0, LONG_TERM_LIMITS.maxEntries);
}

function priority(entry: LongTermMemoryEntry): number {
  const type = { feedback: 400, decision: 300, project: 200, reference: 100 }[entry.type];
  const source = entry.source === "explicit" ? 1000 : 0;
  return source + type + entry.confidence * 10;
}

export function renderWorkspaceMemory(store: WorkspaceMemoryStore): string {
  const active = enforceLongTermLimits(store.entries);
  if (active.length === 0) return "";
  const lines = [
    "<workspace_memory>",
    "Persistent workspace memory. Use as background; verify stale or code-related claims.",
  ];
  for (const type of ["feedback", "project", "decision", "reference"] as const) {
    const items = active.filter(e => e.type === type);
    if (items.length === 0) continue;
    lines.push(`${type}:`);
    for (const item of items) lines.push(`- ${renderEntry(item)}`);
  }
  lines.push("</workspace_memory>");
  return lines.join("\n").slice(0, store.limits.maxRenderedChars);
}

function renderEntry(entry: LongTermMemoryEntry): string {
  const ageDays = Math.floor((Date.now() - new Date(entry.createdAt).getTime()) / 86_400_000);
  const stale = entry.staleAfterDays && ageDays > entry.staleAfterDays ? ` [${ageDays}d old, verify]` : "";
  const rationale = entry.rationale ? ` Why: ${entry.rationale.slice(0, LONG_TERM_LIMITS.maxRationaleChars)}` : "";
  return `${entry.text}${rationale}${stale}`;
}
```
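
The ordering produced by `priority` is easiest to see with numbers. The sketch below copies the same formula into a standalone `score` function (the name and the two sample entries are made up for this example) and evaluates it for an explicit reference and a compaction-derived feedback entry.

```ts
type Entry = {
  type: "feedback" | "decision" | "project" | "reference";
  source: "explicit" | "compaction" | "manual";
  confidence: number;
};

const TYPE_WEIGHT = { feedback: 400, decision: 300, project: 200, reference: 100 } as const;

// Same formula as priority(): source bonus + type weight + confidence * 10.
function score(entry: Entry): number {
  return (entry.source === "explicit" ? 1000 : 0) + TYPE_WEIGHT[entry.type] + entry.confidence * 10;
}

const explicitReference: Entry = { type: "reference", source: "explicit", confidence: 1 };
const compactedFeedback: Entry = { type: "feedback", source: "compaction", confidence: 0.75 };

console.log(score(explicitReference)); // 1110 = 1000 + 100 + 10
console.log(score(compactedFeedback)); // 407.5 = 0 + 400 + 7.5
```

Since a non-explicit entry can score at most 410 (feedback at confidence 1), anything the user stated explicitly sorts ahead of everything extracted during compaction, whatever its type.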

- [ ] **Step 3: Run tests**

Run: `npm test`

Expected: PASS.

---

### Task 6: Implement session state lifecycle

**Files:**
- Create: `src/session-state.ts`
- Test: `tests/session-state.test.ts`

- [ ] **Step 1: Write failing tests**

Create `tests/session-state.test.ts`:

```ts
import test from "node:test";
import assert from "node:assert/strict";
import { createEmptySessionState, touchActiveFile, upsertOpenError, clearErrorsForSuccessfulCommand, renderHotSessionState } from "../src/session-state.ts";
import type { OpenError } from "../src/types.ts";

test("touchActiveFile weights edits above reads", () => {
  const state = createEmptySessionState("s1");
  touchActiveFile(state, "/repo/a.ts", "read");
  touchActiveFile(state, "/repo/b.ts", "edit");
  assert.equal(state.activeFiles[0].path, "/repo/b.ts");
});

test("clearErrorsForSuccessfulCommand clears category", () => {
  const state = createEmptySessionState("s1");
  const err: OpenError = { id: "e", category: "typecheck", summary: "TS error", fingerprint: "f", status: "open", firstSeen: 1, lastSeen: 1, seenCount: 1 };
  upsertOpenError(state, err);
  clearErrorsForSuccessfulCommand(state, "npm run typecheck");
  assert.equal(state.openErrors.length, 0);
});

test("renderHotSessionState includes active files and open errors", () => {
  const state = createEmptySessionState("s1");
  touchActiveFile(state, "/repo/index.ts", "edit");
  upsertOpenError(state, { id: "e", category: "test", summary: "test failed", fingerprint: "f", status: "open", firstSeen: 1, lastSeen: 1, seenCount: 1 });
  const rendered = renderHotSessionState(state, "/repo");
  assert.match(rendered, /index.ts/);
  assert.match(rendered, /test failed/);
});
```

- [ ] **Step 2: Implement session state functions**

Create `src/session-state.ts` with create/load/save/touch/upsert/clear/render functions matching the tests.
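
The plan leaves this file's body open. One possible shape for the in-memory functions that satisfies the three tests is sketched below. Everything in it is an assumption rather than the final implementation: the action weights, the `<hot_session_state>` tag, the inlined types and limits (which the real module would import from `src/types.ts`), and the trimmed `categoryOf` helper standing in for `classifyCommand`. The load/save helpers are omitted.

```ts
type Action = "read" | "grep" | "edit" | "write";
type Category = "typecheck" | "test" | "lint" | "build" | "runtime" | "tool";
type ActiveFile = { path: string; action: Action; count: number; lastSeen: number };
type OpenError = {
  id: string; category: Category; summary: string; fingerprint: string;
  status: "open" | "maybe_fixed"; firstSeen: number; lastSeen: number; seenCount: number;
};
type SessionState = {
  version: 1; sessionID: string; turn: number; updatedAt: string;
  activeFiles: ActiveFile[]; openErrors: OpenError[]; recentDecisions: unknown[];
};

// Assumed weights: mutating actions outrank lookups.
const ACTION_WEIGHT: Record<Action, number> = { write: 4, edit: 3, read: 2, grep: 1 };

function createEmptySessionState(sessionID: string): SessionState {
  return { version: 1, sessionID, turn: 0, updatedAt: new Date().toISOString(), activeFiles: [], openErrors: [], recentDecisions: [] };
}

function touchActiveFile(state: SessionState, path: string, action: Action): void {
  const existing = state.activeFiles.find(file => file.path === path);
  if (existing) {
    existing.count += 1;
    existing.lastSeen = Date.now();
    if (ACTION_WEIGHT[action] > ACTION_WEIGHT[existing.action]) existing.action = action;
  } else {
    state.activeFiles.push({ path, action, count: 1, lastSeen: Date.now() });
  }
  // Rank by action weight, then recency; keep at most 20 (maxActiveFilesStored).
  state.activeFiles.sort((a, b) => ACTION_WEIGHT[b.action] - ACTION_WEIGHT[a.action] || b.lastSeen - a.lastSeen);
  state.activeFiles.length = Math.min(state.activeFiles.length, 20);
}

function upsertOpenError(state: SessionState, error: OpenError): void {
  const existing = state.openErrors.find(e => e.fingerprint === error.fingerprint);
  if (existing) {
    existing.lastSeen = error.lastSeen;
    existing.seenCount += 1;
    existing.status = "open";
  } else {
    state.openErrors.push({ ...error });
  }
  // Keep the 5 most recently seen errors (maxOpenErrorsStored).
  state.openErrors.sort((a, b) => b.lastSeen - a.lastSeen);
  state.openErrors.length = Math.min(state.openErrors.length, 5);
}

// Trimmed stand-in for classifyCommand from src/extractors.ts.
function categoryOf(command: string): Category | null {
  const c = command.toLowerCase();
  if (/\b(tsc|typecheck)\b/.test(c)) return "typecheck";
  if (/\b(test|vitest|jest|mocha|pytest)\b/.test(c)) return "test";
  if (/\b(lint|eslint|biome)\b/.test(c)) return "lint";
  if (/\b(build|webpack|tsup)\b/.test(c)) return "build";
  return null;
}

function clearErrorsForSuccessfulCommand(state: SessionState, command: string): void {
  const category = categoryOf(command);
  if (category) state.openErrors = state.openErrors.filter(e => e.category !== category);
}

function renderHotSessionState(state: SessionState, root: string): string {
  const relative = (path: string) => (path.startsWith(`${root}/`) ? path.slice(root.length + 1) : path);
  const lines = ["<hot_session_state>"];
  if (state.activeFiles.length > 0) {
    lines.push("active_files:");
    for (const file of state.activeFiles.slice(0, 8)) lines.push(`- ${relative(file.path)} (${file.action}, ${file.count}x)`);
  }
  if (state.openErrors.length > 0) {
    lines.push("open_errors:");
    for (const error of state.openErrors.slice(0, 3)) lines.push(`- [${error.category}] ${error.summary}`);
  }
  lines.push("</hot_session_state>");
  return lines.join("\n").slice(0, 1200);
}
```

Sorting on every touch keeps the stored list in render order, so `renderHotSessionState` only has to take the first eight files and the first three errors.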

- [ ] **Step 3: Run tests**

Run: `npm test`

Expected: PASS.

---

### Wave 2 verification checkpoint

- [ ] **Step 1: Run all checks**

Run: `npm test && npm run typecheck`

Expected: PASS.

- [ ] **Step 2: Review wave output**

Confirm: Long-term store enforces limits and renders staleness. Hot session state ranks active files, stores open errors, and clears category errors on successful validation commands.

- [ ] **Step 3: Commit wave**

```bash
git add src tests
git commit -m "feat: add workspace memory and hot session state"
```

---

## Wave 3 — Plugin Hook Integration

### Task 7: Wire OpenCode helper functions

**Files:**
- Create: `src/opencode.ts`

- [ ] **Step 1: Add SDK wrappers**

Create `src/opencode.ts` with helpers:

```ts
export async function latestUserText(client: any, sessionID: string): Promise<{ id: string; text: string } | null> {
  const result = await client.session.messages({ path: { id: sessionID } });
  const messages = result.data ?? [];
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.info?.role !== "user") continue;
    const text = msg.parts?.filter((p: any) => p.type === "text").map((p: any) => p.text).join("\n") ?? "";
    if (text.trim()) return { id: msg.info.id, text };
  }
  return null;
}

export async function latestCompactionSummary(client: any, sessionID: string): Promise<string | null> {
  const result = await client.session.messages({ path: { id: sessionID } });
  const messages = result.data ?? [];
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg.info?.role !== "assistant" || msg.info?.summary !== true) continue;
    const text = msg.parts?.filter((p: any) => p.type === "text").map((p: any) => p.text).join("\n") ?? "";
    if (text.trim()) return text;
  }
  return null;
}

export async function pendingTodos(client: any, sessionID: string): Promise<Array<{ content: string; status: string; priority?: string }>> {
  try {
    const result = await client.session.todo({ path: { id: sessionID } });
    return (result.data ?? []).filter((todo: any) => todo.status !== "completed");
  } catch {
    return [];
  }
}
```

- [ ] **Step 2: Run typecheck**

Run: `npm run typecheck`

Expected: PASS.

---

### Task 8: Implement plugin orchestration

**Files:**
- Create: `src/plugin.ts`
- Modify: `index.ts`

- [ ] **Step 1: Replace `index.ts` entrypoint**

```ts
export { default } from "./src/plugin";
```

- [ ] **Step 2: Implement hooks in `src/plugin.ts`**

Create plugin that:

- caches frozen workspace memory per `sessionID`
- processes explicit memory from latest user text once per message id
- injects frozen workspace memory and dynamic hot session state
- updates session state after tools
- augments compaction context with memory, hot state, todos, and memory candidate instruction
- parses compaction summaries from `session.compacted` event and merges candidates
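
The first two bullets (a frozen memory snapshot per session, and explicit-memory processing once per message id) can be sketched as a small in-memory cache. This is only an illustration under stated assumptions: `SessionCache`, `getFrozenMemory`, and `shouldProcessMessage` are hypothetical names, and the `load` callback stands in for however the plugin renders the workspace store.

```typescript
// Hypothetical helper: one frozen memory string per session, plus the set of
// user message ids that have already been scanned for explicit memory.
class SessionCache {
  private frozen = new Map<string, string>();
  private seenMessages = new Map<string, Set<string>>();

  // Render workspace memory once per session; later calls reuse the snapshot.
  getFrozenMemory(sessionID: string, load: () => string): string {
    const cached = this.frozen.get(sessionID);
    if (cached !== undefined) return cached;
    const value = load();
    this.frozen.set(sessionID, value);
    return value;
  }

  // Returns true only the first time a (session, message) pair is seen.
  shouldProcessMessage(sessionID: string, messageID: string): boolean {
    let seen = this.seenMessages.get(sessionID);
    if (!seen) {
      seen = new Set<string>();
      this.seenMessages.set(sessionID, seen);
    }
    if (seen.has(messageID)) return false;
    seen.add(messageID);
    return true;
  }
}
```

Keeping the snapshot frozen for the lifetime of a session means the injected memory text stays stable between turns, while the message-id guard prevents the same "remember this" message from being merged twice.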

The compaction instruction must be:

```ts
function memoryCandidateInstruction(): string {
  return `
At the end of the compaction summary, include:

<workspace_memory_candidates>
- [feedback] ...
- [project] ...
- [decision] ...
- [reference] ...
</workspace_memory_candidates>

Only include durable information useful across future sessions in this exact workspace.
Do NOT include active file lists, raw errors, temporary progress, stack traces, code signatures, API docs, git history, or facts easily rediscovered from the repository.
For decisions, include rationale in one sentence.
If nothing qualifies, output an empty block.
`.trim();
}
```
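
The receiving side of this instruction can be sketched as follows. This is a minimal sketch, not the plan's parser: `extractCandidates` and the `Candidate` shape are hypothetical, and the real `parseWorkspaceMemoryCandidates()` also applies quality gates that are omitted here.

```typescript
type CandidateType = "feedback" | "project" | "decision" | "reference";

interface Candidate {
  type: CandidateType;
  text: string;
}

// Hypothetical helper: pull `- [type] text` lines out of the candidates block.
function extractCandidates(summary: string): Candidate[] {
  const block = summary.match(
    /<workspace_memory_candidates>([\s\S]*?)<\/workspace_memory_candidates>/,
  );
  if (!block) return []; // no block at all
  const out: Candidate[] = [];
  for (const line of block[1].split("\n")) {
    const item = line.trim().match(/^-\s*\[(feedback|project|decision|reference)\]\s*(.+)$/i);
    if (!item) continue; // blank lines, unknown types, stray prose
    const text = item[2].trim();
    if (text === "...") continue; // template placeholder echoed back by the model
    out.push({ type: item[1].toLowerCase() as CandidateType, text });
  }
  return out;
}
```

An empty block and a missing block both yield an empty array, so the caller can merge the result unconditionally.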

- [ ] **Step 3: Run typecheck**

Run: `npm run typecheck`

Expected: PASS.

---

### Wave 3 verification checkpoint

- [ ] **Step 1: Run all checks**

Run: `npm test && npm run typecheck`

Expected: PASS.

- [ ] **Step 2: Manual plugin smoke test**

Run OpenCode with local plugin and verify:

- user message `请记住:这个 workspace 的 memory 功能要默认无感` ("Please remember: this workspace's memory feature should be unobtrusive by default") creates a long-term entry
- reading/editing files updates hot session state
- failed typecheck creates an open error
- successful typecheck clears typecheck errors

- [ ] **Step 3: Commit wave**

```bash
git add index.ts src tests
git commit -m "feat: wire memory v2 plugin hooks"
```

---

## Wave 4 — Documentation and Migration

### Task 9: Update documentation

**Files:**
- Modify: `README.md`
- Modify: `docs/architecture.md`
- Modify: `docs/configuration.md`
- Modify: `AGENTS.md`

- [ ] **Step 1: Update README feature summary**

Describe Memory V2 as:

- workspace-scoped long-term memory
- hot session state
- no default agent-visible memory tools
- no raw tool-output cache
- compaction boundary extraction with no extra LLM call

- [ ] **Step 2: Update architecture doc**

Replace four-tier architecture with:

```text
Layer 1: Stable Workspace Memory
Layer 2: Hot Session State
Layer 3: Native OpenCode State
```

- [ ] **Step 3: Update configuration doc**

Document:

- `LONG_TERM_LIMITS`
- `HOT_STATE_LIMITS`
- storage root under `XDG_DATA_HOME` or `~/.local/share`
- optional future `/memory import`

- [ ] **Step 4: Update AGENTS.md**

Update commands:

```bash
npm test
npm run typecheck
```

Update storage and testing guidance to match Memory V2.

---

### Task 10: Remove obsolete implementation paths

**Files:**
- Modify: `index.ts` if old code remains
- Modify: docs references if any still mention old APIs

- [ ] **Step 1: Remove obsolete references**

Ensure repo no longer advertises default tools:

- `core_memory_update`
- `core_memory_read`
- `working_memory_add`
- `working_memory_clear`
- `working_memory_clear_slot`
- `working_memory_remove`

Unless a debug-only compatibility layer is explicitly retained, these names must not appear in README or architecture docs.

- [ ] **Step 2: Remove obsolete concepts from docs**

Remove or mark deprecated:

- slots/pool/decay
- pressure monitor as core feature
- raw tool-output cache
- smart pruning replacing old tool outputs

- [ ] **Step 3: Run docs grep**

Run: `grep -R "core_memory_update\|working_memory_add\|pressure monitor\|tool-output cache" README.md docs AGENTS.md`

Expected: no matches, or matches only under a clearly marked migration note.

---

### Wave 4 verification checkpoint

- [ ] **Step 1: Run all checks**

Run: `npm test && npm run typecheck`

Expected: PASS.

- [ ] **Step 2: Verify docs match code**

Confirm: README, architecture, configuration, and AGENTS describe Memory V2 and do not promise old tools or old four-tier behavior.

- [ ] **Step 3: Commit wave**

```bash
git add README.md docs AGENTS.md index.ts src tests package.json
git commit -m "docs: document memory v2 design"
```

---

## Verification Strategy

### Automated

- `npm test` validates extractors, long-term merge/render, and hot session lifecycle.
- `npm run typecheck` validates TypeScript imports and plugin entrypoint.

### Manual OpenCode smoke tests

1. Start a session with the plugin enabled.
2. Send: `请记住:这个 workspace 的 memory 功能要默认无感`.
3. Confirm `workspace-memory.json` is written under `~/.local/share/opencode-working-memory/workspaces/<hash>/` (or the equivalent path under `XDG_DATA_HOME` when it is set).
4. Read and edit a file.
5. Confirm session state active files update.
6. Run a failing typecheck command.
7. Confirm open error appears in hot state.
8. Run a passing typecheck command.
9. Confirm typecheck error clears.
10. Trigger or simulate compaction.
11. Confirm compaction context includes memory candidate instruction and parsed candidates merge after compaction.

---

## Risk Controls

- **False memory extraction:** explicit regex only matches strong remember/from-now-on phrasing; compaction extraction uses explicit “what not to save” boundaries.
- **Token overhead:** no background LLM agent; compaction extraction piggybacks existing compaction call; hot state capped at 1200 chars.
- **Stale memory:** decision/project/reference entries have stale markers during render.
- **Privacy:** storage lives in user data directory, not repo, and writes with `0600` mode.
- **Duplicate todo state:** todos are not stored by the plugin; OpenCode remains source of truth.
- **Error staleness:** errors clear only after successful validation commands and become `maybe_fixed` after related edits.
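
The error-staleness rule in the last bullet can be read as a two-event state machine. The sketch below is an assumption-laden illustration, not the plugin's `OpenError` type: `TrackedError` and its fields are hypothetical, and "related edit" is simplified to "an edit to the same file".

```typescript
type ErrorStatus = "open" | "maybe_fixed";

interface TrackedError {
  category: string; // e.g. "typecheck" or "test" (assumed category names)
  file?: string;
  status: ErrorStatus;
}

// A related edit only lowers confidence in the error; it never removes it.
function onFileEdited(errors: TrackedError[], file: string): TrackedError[] {
  return errors.map((e) => (e.file === file ? { ...e, status: "maybe_fixed" as const } : e));
}

// Only a successful validation command for the category clears its errors.
function onValidationPassed(errors: TrackedError[], category: string): TrackedError[] {
  return errors.filter((e) => e.category !== category);
}
```

Separating the two events keeps an optimistic edit from hiding an error that the next typecheck would still report.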

---

## Self-Review

- Spec coverage: plan implements workspace-scoped cross-session memory, bounded long-term memory, compaction-boundary update, fully automatic hot session memory, and no extra LLM calls.
- Placeholder scan: plan contains no TBD/TODO placeholders; Tasks 8-10 reference exact expected behavior and code boundaries.
- Type consistency: `LongTermMemoryEntry`, `WorkspaceMemoryStore`, `SessionState`, `ActiveFile`, `OpenError`, and `SessionDecision` are defined once in Task 1 and reused consistently.
- Wave coherence: each wave ends with tests/typecheck and a committable checkpoint.
@@ -0,0 +1,815 @@
# Memory Deduplication and Staleness Analysis

Date: 2026-04-26

## Executive recommendation

Fix this at storage time first, then tighten ingestion prompts.

Storage is the safety net. Every memory entry, whether from compaction, explicit user instruction, or future manual editing, already flows through `normalizeWorkspaceMemory()` in `src/workspace-memory.ts`. That is the right architectural choke point for deduplication, supersession, and lifecycle pruning.

Prompt changes are still useful, but only as a noise reducer. They cannot be the source of truth because model output will drift, multilingual phrasing will vary, and old stores already contain bad entries.

Do not add embeddings yet. The current store has 22 entries against a limit of 28, and all current failures are simple lexical/category problems. Embeddings would add latency, dependencies, nondeterminism, and storage shape questions for a problem that can be solved with boring code.

## Current data flow

```text
OpenCode session.compacted event
        │
        ▼
latestCompactionSummary(client, sessionID)
        │
        ▼
parseWorkspaceMemoryCandidates(summary)
        │  src/extractors.ts
        │  - validates shape and basic quality
        │  - assigns type/source/confidence/staleAfterDays
        ▼
updateWorkspaceMemory(directory, store => {
  store.entries.push(...candidates)
})
        │
        ▼
normalizeWorkspaceMemory(root, store)
        │  src/workspace-memory.ts
        │  - exact canonical dedupe only
        │  - maxEntries trim
        ▼
workspace-memory.json
```

The broken boundary is clear: ingestion appends all candidates, and normalization only dedupes exact normalized text per type.

## Problem 1: near-duplicate accumulation

### Diagnosis

`canonicalMemoryText()` catches only exact matches after NFKC, lowercase, and punctuation/whitespace collapse. It does not catch:

- same fact with extra location detail
- same path with slightly different label text
- same decision revised from version 3 to version 4
- bilingual restatements of the same project fact
- new fix superseding an older fix for the same issue

This is not one dedupe problem. It is three different classes wearing the same hat.

```text
Near duplicate classes
────────────────────────────────────────────
project/reference → entity identity problem
feedback          → topic preference/result problem
decision          → supersession/history problem
```

Treating all of these with one fuzzy text threshold will either miss real duplicates or delete useful distinct decisions.

### Ingestion time vs storage time

Use both, with different jobs.

#### Storage time, required

Add deterministic memory normalization in `src/workspace-memory.ts`:

1. exact canonical dedupe, keep existing behavior
2. type-specific identity keys for obvious entities
3. simple lexical similarity for same-type candidates
4. explicit supersession rules for versioned/solution-style decisions
5. lifecycle pruning before `maxEntries` trim

Why storage first:

- one code path for compaction, explicit, manual, and tests
- fixes existing stores on next load/save
- deterministic and unit-testable
- does not depend on model behavior

#### Ingestion time, useful but secondary

Improve `buildCompactionPrompt()` in `src/plugin.ts` so compaction receives existing memory and is told to emit only new or replacing facts.

The current prompt already passes rendered workspace memory as background context and says "Do not output this context verbatim." That is not strong enough. Add a small rule near `Memory candidates:`:

```text
Before emitting a memory candidate, compare it to Background context.
Do not emit a candidate that repeats an existing memory.
If a new candidate replaces an older one, write only the newer statement.
Prefer one canonical statement per project fact, reference path, user feedback topic, or implementation decision.
```

This will reduce noise. It will not eliminate it. Models repeat themselves. Software should expect this.

### Recommended deduplication strategy

Use deterministic, type-aware dedupe. Avoid embeddings. Avoid global fuzzy dedupe as the main rule.

#### 1. Keep exact canonical dedupe

Current logic is good as the first pass.

```ts
const dedupKey = `${entry.type}:${canonicalMemoryText(text)}`;
```

Keep source/confidence tie-breaking.
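
For orientation, the exact pass can be sketched like this. The sketch assumes a simplified `Entry` shape and reimplements `canonicalMemoryText()` from its description (NFKC, lowercase, punctuation/whitespace collapse); the real function may differ, and this version only collapses ASCII punctuation. It also keeps the first entry per key instead of applying source/confidence tie-breaking.

```typescript
interface Entry {
  type: string;
  text: string;
}

// Assumed canonical form: NFKC, lowercase, then ASCII punctuation and
// whitespace runs collapsed to a single space.
function canonicalMemoryText(text: string): string {
  return text
    .normalize("NFKC")
    .toLowerCase()
    .replace(/[\s!-\/:-@\[-`{-~]+/g, " ")
    .trim();
}

// Keep the first entry seen for each `${type}:${canonical text}` key.
function exactDedupe(entries: Entry[]): Entry[] {
  const byKey = new Map<string, Entry>();
  for (const entry of entries) {
    const key = `${entry.type}:${canonicalMemoryText(entry.text)}`;
    if (!byKey.has(key)) byKey.set(key, entry);
  }
  return Array.from(byKey.values());
}
```

Because the key includes the type, a `project` and a `decision` with identical text are both kept, which matches the "per type" behavior described earlier.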

#### 2. Add type-specific identity extraction

For `project` and `reference`, dedupe by identifiable anchors, not prose.

Examples:

- repo/plugin system facts: normalized phrase key like `opencode-agenthub plugin system`
- file paths: normalized path key, with backticks stripped
- URLs/domains if they appear later

For the current data:

```text
reference:path:.opencode-agenthub/current/xdg/opencode/opencode.json
project:phrase:opencode-agenthub plugin system
```

When two entries share the same identity key, merge them by keeping the more useful text:

1. explicit source beats manual beats compaction
2. higher confidence beats lower confidence
3. more specific text beats vague text, usually longer but cap this to avoid keeping rambles
4. newer beats older if specificity/source/confidence tie

This directly fixes:

- `OpenCode plugin config location: ...` vs `OpenCode plugin config: ...`
- Chinese and English variants that both mention `opencode-agenthub plugin system`
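
A minimal sketch of the path key and the four merge rules follows. It is illustrative only: the `MemoryEntry` fields, the path regex, and the 200-character specificity cap are assumptions, not values taken from the repository.

```typescript
type Source = "explicit" | "manual" | "compaction";

interface MemoryEntry {
  text: string;
  source: Source;
  confidence: number;
  updatedAt: string; // ISO timestamp; same-format ISO strings compare correctly as text
}

const SOURCE_RANK: Record<Source, number> = { explicit: 3, manual: 2, compaction: 1 };
const SPECIFICITY_CAP = 200; // assumed cap so rambling text does not win on length alone

// Extract lowercase path-like tokens (backticks stripped) to use as identity keys.
function extractPathKeys(text: string): string[] {
  const matches = text.replace(/`/g, "").match(/[\w.~-]*\/[\w./~-]+/g) ?? [];
  return matches.map((p) => p.toLowerCase().replace(/[.,;:]+$/, ""));
}

// Apply the merge rules in order: source, confidence, capped length, recency.
function chooseBetterMemory(a: MemoryEntry, b: MemoryEntry): MemoryEntry {
  if (SOURCE_RANK[a.source] !== SOURCE_RANK[b.source]) {
    return SOURCE_RANK[a.source] > SOURCE_RANK[b.source] ? a : b;
  }
  if (a.confidence !== b.confidence) return a.confidence > b.confidence ? a : b;
  const la = Math.min(a.text.length, SPECIFICITY_CAP);
  const lb = Math.min(b.text.length, SPECIFICITY_CAP);
  if (la !== lb) return la > lb ? a : b;
  return a.updatedAt >= b.updatedAt ? a : b;
}
```

Two `reference` entries whose `extractPathKeys()` results overlap would be treated as the same entity and reduced to whichever one `chooseBetterMemory()` returns.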

#### 3. Add conservative lexical similarity only inside same type

Use token Jaccard or Dice similarity over normalized tokens after stopword removal. No new dependencies.

Suggested thresholds:

```text
project/reference: >= 0.72 duplicate
feedback:          >= 0.70 possible duplicate if same topic anchor exists
decision:          do not use fuzzy deletion by default
```

This should be a fallback after identity keys, not the primary system.

Risk: fuzzy matching can delete nearby but distinct decisions. Example: "Markdown headers cause purple text" and "Plain text labels avoid special markup" are related but both useful in the history of the bug.

Keep fuzzy matching conservative and type-scoped.
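
Token Jaccard needs only a few lines. The sketch below assumes a tiny English stopword list and ASCII-only tokenization, so it ignores CJK text entirely; both are simplifications for illustration, and `isLexicalDuplicate` is a hypothetical name.

```typescript
const STOPWORDS = new Set(["the", "a", "an", "is", "are", "of", "to", "in", "for", "and", "or"]);

// Lowercased ASCII word tokens with stopwords removed.
function normalizedTokens(text: string): Set<string> {
  const tokens = text
    .normalize("NFKC")
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0 && !STOPWORDS.has(t));
  return new Set(tokens);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, defined as 0 when both sets are empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Callers are expected to compare only entries of the same type.
function isLexicalDuplicate(a: string, b: string, threshold = 0.72): boolean {
  return jaccard(normalizedTokens(a), normalizedTokens(b)) >= threshold;
}
```

With these definitions the two purple-text decisions quoted above share only the token `text`, so they score far below the threshold and both survive.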

#### 4. Use explicit supersession for decisions

Decision duplication is fundamentally different. Decisions often form a timeline. Some are still valuable context, some are obsolete.

The pair below is supersession, not duplication:

```text
Parser supports 3 formats: HTML comment, Markdown section, legacy XML
Parser supports 4 formats: plain text label, Markdown section, legacy section name, legacy XML
```

The right model is: newer active decision supersedes older active decision on the same topic.

Keep this simple. Do not build a knowledge graph.

Add a small `decisionTopicKey(text)` heuristic:

```text
parser supports <n> formats       → decision:parser-supported-formats
solution: use ...                 → decision:purple-italic-output-format, if text contains purple/italic/markup/markdown/xml/html/comment/label
use output.prompt ... template    → decision:compaction-template-replacement
opencode plugin load/config facts → decision:plugin-loading-config
```

That sounds bespoke, but that is acceptable here. The repo is small, the memory types are product-specific, and the current bad entries are product-specific. Boring beats clever.

When same decision topic appears:

- keep the newest active entry as active
- optionally mark the older entry `status: "superseded"` if the type supports it, or drop it during normalization if old status values are not preserved
- do not render superseded entries

If preserving history matters later, add `supersededBy?: string` and `supersededAt?: string` to the type. Not needed for the first fix.
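
As a rough illustration, the heuristic and the "newest wins per topic" rule could look like the sketch below. Only two of the four topic patterns are covered, the regexes are guesses at reasonable matching, and the `Decision` shape is hypothetical; older entries are simply dropped instead of being marked `superseded`.

```typescript
interface Decision {
  text: string;
  updatedAt: string; // ISO timestamp
}

// Hypothetical heuristic covering two of the listed topic patterns.
function decisionTopicKey(text: string): string | null {
  const t = text.toLowerCase();
  if (/parser supports \d+ formats/.test(t)) return "decision:parser-supported-formats";
  if (/^solution: use\b/.test(t) && /purple|italic|markup|markdown|xml|html|comment|label/.test(t)) {
    return "decision:purple-italic-output-format";
  }
  return null; // no known topic: this rule never supersedes the entry
}

// Keep only the newest decision per topic; decisions without a topic key pass through.
function supersedeDecisions(decisions: Decision[]): Decision[] {
  const newest = new Map<string, Decision>();
  const untouched: Decision[] = [];
  for (const d of decisions) {
    const key = decisionTopicKey(d.text);
    if (key === null) {
      untouched.push(d);
      continue;
    }
    const current = newest.get(key);
    if (!current || d.updatedAt > current.updatedAt) newest.set(key, d);
  }
  return untouched.concat(Array.from(newest.values()));
}
```

Returning `null` for unknown topics is the conservative default: an unrecognized decision is never deleted by supersession, only by the other passes.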

### Type-specific policy

| Type | Nature | Recommended dedupe | Keep history? |
|---|---|---|---|
| `project` | stable facts about repo/system | identity key + conservative similarity | no, keep one canonical fact |
| `reference` | pointer to path/URL/config | path/URL/entity key | no, keep one canonical pointer |
| `feedback` | user preference or resolved issue | topic key + newer wins for same issue | usually no |
| `decision` | implementation choice over time | topic supersession, not fuzzy duplicate deletion | sometimes, but render only active latest |

## Problem 2: stale entries never cleaned

### Diagnosis

`staleAfterDays` exists, but only `renderEntry()` uses it to append `[Xd old, verify]`. Nothing removes or demotes stale entries. As a result, the store is monotonic until `maxEntries` forces a priority trim.

That trim is the wrong cleanup mechanism. It sorts by type/source/confidence, not usefulness. A stale high-priority decision can beat a fresh low-priority reference.

### When to prune

Prune during storage normalization, not render.

`normalizeWorkspaceMemory()` is already called by `load/save/updateWorkspaceMemory()`. That gives one central place to enforce lifecycle rules.

```text
load/update/save
        │
        ▼
normalizeWorkspaceMemory()
        │
        ├─ drop inactive/superseded from active set
        ├─ exact dedupe
        ├─ identity dedupe
        ├─ supersession
        ├─ stale lifecycle pruning
        └─ maxEntries trim
```

Do not prune only on render. Render is presentation. If render hides or labels stale entries while the JSON keeps growing, the system still rots.

Do not require explicit cleanup as the only path. It will not run often enough. An explicit cleanup command can be added later for manual inspection, but automatic normalization should handle the common case.

### Should `staleAfterDays` be enforced?

Yes, but not uniformly as immediate deletion for every type.

`staleAfterDays` means "this should be revalidated after this age." It does not always mean "delete at this age."

Use a two-tier lifecycle:

```text
fresh     age <= staleAfterDays
stale     staleAfterDays < age <= staleAfterDays + grace
prunable  age > staleAfterDays + grace
```

Suggested grace periods:

| Type | Current staleAfterDays | Grace | Auto-prune? | Rationale |
|---|---:|---:|---|---|
| `feedback` | none | none | no age-based prune | User preference can remain valid indefinitely. Prune only by supersession/topic replacement. |
| `decision` | 45 | 15 | yes if compaction/manual and not explicit | Implementation decisions age fast. Supersession should remove most earlier. |
| `project` | 60 | 30 | yes if compaction/manual and no strong identity/path | Project facts change slower. Keep explicit project facts unless replaced. |
| `reference` | 90 | 30 | yes if path no longer exists or prunable age exceeded | References are rediscoverable and can become stale. |

For the first implementation, a simpler rule is enough:

```text
Never age-prune feedback.
Never age-prune explicit entries automatically.
Drop compaction/manual entries when age > staleAfterDays + 30 days.
Drop superseded entries immediately from the active set.
```

This keeps user-owned memory safe while preventing compaction sludge.
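
The simpler first-implementation rule translates almost directly into code. In this sketch the `AgedEntry` shape is hypothetical, age is measured from `updatedAt` (an assumption; the store might use a creation timestamp instead), and the grace period is the flat 30 days from the rule above.

```typescript
type Source = "explicit" | "manual" | "compaction";
type Lifecycle = "fresh" | "stale" | "prunable";

interface AgedEntry {
  type: "feedback" | "project" | "decision" | "reference";
  source: Source;
  updatedAt: string; // ISO timestamp
  staleAfterDays?: number; // absent means the entry never goes stale
}

const DAY_MS = 24 * 60 * 60 * 1000;
const GRACE_DAYS = 30;

// Classify an entry as fresh, stale, or prunable based on its age at `now`.
function lifecycle(entry: AgedEntry, now: Date): Lifecycle {
  if (entry.staleAfterDays === undefined) return "fresh";
  const ageDays = (now.getTime() - new Date(entry.updatedAt).getTime()) / DAY_MS;
  if (ageDays <= entry.staleAfterDays) return "fresh";
  return ageDays <= entry.staleAfterDays + GRACE_DAYS ? "stale" : "prunable";
}

// Feedback and explicit entries are never age-pruned automatically.
function isPrunableByAge(entry: AgedEntry, now: Date): boolean {
  if (entry.type === "feedback" || entry.source === "explicit") return false;
  return lifecycle(entry, now) === "prunable";
}
```

Passing `now` in as a parameter keeps the function pure, which is what makes the staleness regression tests deterministic.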

### Explicit vs implicit contradiction detection

Use explicit supersession for known memory shapes. Do not try general contradiction detection.

General contradiction detection without LLM or embeddings is brittle. With an LLM it is nondeterministic and adds another model-quality surface. The current problem does not need that.

Recommended model:

- explicit supersession for same decision topic, same reference path, same project entity, same feedback topic
- newer entry wins inside the same topic unless older has higher source priority
- if `source === "explicit"`, require a newer explicit entry to replace it, or keep both

This gives predictable behavior and avoids deleting user instructions because a compaction guessed a replacement.
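
One possible reading of these three rules, for two entries already known to share a topic key, is sketched below. The `TopicEntry` shape and `resolveSameTopic` are hypothetical, and the sketch picks "keep both" when a non-explicit entry tries to replace an explicit one, which is one of the two options the last bullet allows.

```typescript
type Source = "explicit" | "manual" | "compaction";

interface TopicEntry {
  text: string;
  source: Source;
  updatedAt: string; // ISO timestamp
}

const SOURCE_PRIORITY: Record<Source, number> = { explicit: 3, manual: 2, compaction: 1 };

// Resolve two entries that share a topic key. Returns the entries that stay active.
function resolveSameTopic(older: TopicEntry, newer: TopicEntry): TopicEntry[] {
  // An explicit entry is only replaced by a newer explicit entry; otherwise keep both.
  if (older.source === "explicit" && newer.source !== "explicit") return [older, newer];
  // Otherwise the newer entry wins unless the older one has higher source priority.
  if (SOURCE_PRIORITY[older.source] > SOURCE_PRIORITY[newer.source]) return [older];
  return [newer];
}
```

The explicit check comes first so that a compaction guess can never silently remove a user instruction.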

## Concrete implementation plan

### P0: centralize deterministic cleanup in `src/workspace-memory.ts`

Add helpers near `canonicalMemoryText()`:

```text
normalizedTokens(text)
extractPathKeys(text)
memoryIdentityKeys(entry)
decisionTopicKey(text)
feedbackTopicKey(text)
isPrunableByAge(entry, now)
chooseBetterMemory(existing, candidate)
```

Then change `enforceLongTermLimits(entries)` to run in phases:

```text
1. keep active entries only
2. truncate text
3. drop entries prunable by age, except feedback and explicit
4. exact canonical dedupe
5. identity-key dedupe for project/reference/feedback
6. decision-topic supersession
7. sort by priority with freshness as a tie-breaker
8. slice to maxEntries
```

Add freshness to `priority()` or to the final sort tie-breaker. Do not let 90-day-old compaction entries beat fresh entries just because type weight is higher.

Minimal version:

```text
priority desc, source priority desc, freshness desc, updatedAt desc
```
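
The minimal sort order maps onto a four-level comparator. The sketch below assumes a hypothetical `Ranked` shape in which type weight, source priority, and freshness have already been computed; in this version freshness only breaks ties between entries with equal type and source priority, as the minimal ordering states.

```typescript
interface Ranked {
  priority: number; // type weight, higher is more important
  sourcePriority: number; // explicit > manual > compaction
  fresh: boolean; // true while age <= staleAfterDays
  updatedAt: string; // ISO timestamp
}

// priority desc, source priority desc, freshness desc, updatedAt desc
function compareForTrim(a: Ranked, b: Ranked): number {
  if (a.priority !== b.priority) return b.priority - a.priority;
  if (a.sourcePriority !== b.sourcePriority) return b.sourcePriority - a.sourcePriority;
  if (a.fresh !== b.fresh) return a.fresh ? -1 : 1;
  if (a.updatedAt === b.updatedAt) return 0;
  return a.updatedAt > b.updatedAt ? -1 : 1;
}

// Sort a copy and keep the first maxEntries, leaving the input untouched.
function trimToLimit<T extends Ranked>(entries: T[], maxEntries: number): T[] {
  return entries.slice().sort(compareForTrim).slice(0, maxEntries);
}
```

Because this runs last, it only decides which survivors of dedupe and pruning are cut when the store is still over the limit.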

### P1: improve compaction prompt

Update `buildCompactionPrompt()` with dedupe instructions before the `Memory candidates:` examples.

Keep this short. Long prompts invite drift.

### P1: add tests before changing behavior

Use `tests/workspace-memory.test.ts` for normalization behavior.

Required regression tests:

```text
CODE PATH COVERAGE
==================
[+] enforceLongTermLimits(entries)
    ├── [GAP] exact canonical duplicate still dedupes
    ├── [GAP] project opencode-agenthub bilingual/long-short variants collapse to one
    ├── [GAP] reference same config path variants collapse to one
    ├── [GAP] decision parser 4 formats supersedes parser 3 formats
    ├── [GAP] feedback purple/italic newer fix supersedes older fix
    ├── [GAP] stale compaction decision older than staleAfterDays + grace is pruned
    ├── [GAP] stale explicit decision is retained
    └── [GAP] maxEntries trim runs after dedupe/prune

[+] renderWorkspaceMemory(store)
    └── [GAP] does not render superseded/pruned entries
```

No E2E needed. These are pure functions and deterministic store normalization paths.

### P2: optional explicit cleanup command

Later, add a manual cleanup/report command that prints:

- duplicates removed
- superseded decisions
- stale entries pruned
- entries retained because explicit

Not needed for the first fix. Useful for trust once memory stores grow.

## Why not embeddings

Embeddings are the wrong tool at this scale.

Costs:

- new dependency/API or local model decision
- cache/versioning problem for embedding vectors
- nondeterministic thresholds
- hard-to-debug deletions
- privacy and offline behavior questions

The current store has 22 entries. The failures are obvious strings, paths, topics, and versioned decisions. Use deterministic rules now. Reconsider embeddings only if stores grow into hundreds of entries and lexical/topic rules fail in real usage.

## Risks and tradeoffs

### Risk: deleting useful historical decisions

Mitigation: do not apply broad fuzzy dedupe to `decision`. Use topic-specific supersession only for known patterns. Keep explicit entries unless explicitly replaced.

### Risk: bespoke topic keys become a pile of regexes

Mitigation: keep the first version tiny and test-driven. Add keys only for observed failures. If this grows past roughly 10 topic rules, revisit the model.

### Risk: prompt-only fix gives false confidence

Mitigation: prompt change is P1, storage normalization is P0. The store must protect itself.

### Risk: stale pruning removes something still useful

Mitigation: no age pruning for feedback, no automatic age pruning for explicit entries, and grace periods for compaction/manual entries.

### Risk: normalization mutates existing stores unexpectedly

Mitigation: add tests with fixtures from the current store. Consider logging cleanup counts in development if a logging channel exists. The output should be deterministic.

## NOT in scope

- Embedding similarity, too much machinery for 22 entries.
- LLM-based contradiction detection, nondeterministic and hard to test.
- Full memory history graph with `supersededBy`, useful later but not required for current rendering quality.
- New cleanup UI or CLI, optional P2 after deterministic normalization lands.
- Changing `LongTermMemoryEntry` schema, avoid migration unless history preservation becomes required.

## Prioritized steps

1. **P0: Add tests in `tests/workspace-memory.test.ts` using the concrete duplicate examples from the current store.** This locks the desired behavior before touching cleanup logic.
2. **P0: Implement storage-time cleanup in `enforceLongTermLimits()`.** Exact dedupe, identity-key dedupe, decision supersession, stale pruning, then max-entry trim.
3. **P0: Make stale lifecycle enforceable but conservative.** No age pruning for feedback or explicit entries. Prune compaction/manual entries after `staleAfterDays + 30`.
4. **P1: Tighten `buildCompactionPrompt()` to avoid re-emitting existing memories and emit only replacing facts.** This reduces future noise but is not trusted as the only defense.
5. **P1: Add regression fixtures matching the real `workspace-memory.json` problem set.** Assert resulting entries are below the current 22 and contain the newer/canonical facts.
6. **P2: Add a cleanup report command only if users need visibility.** Defer until after the automatic path proves itself.

## Final architecture decision

The memory store should be self-cleaning at its storage boundary.

Use prompt engineering to reduce bad candidates, but make `src/workspace-memory.ts` the authority for what persists. Use deterministic, type-aware dedupe instead of embeddings. Treat `project` and `reference` as entity identity problems, `feedback` as topic replacement, and `decision` as explicit supersession.

That is the smallest design that solves the real failures without turning a 28-entry JSON file into a search platform.

## Addendum: bracketless memory candidate format from real compaction

Date: 2026-04-26

### Summary table

| Issue | Severity | Fix | Priority |
|-------|----------|-----|----------|
| Parser silently drops `- project text` bracketless candidates | High | Accept both `- [type] text` and `- type text` | P0 |
| Prompt examples imply brackets but do not explicitly require exact syntax | Medium | Add "Use exactly this format, including square brackets" plus a negative example | P0, same small patch |
| No regression test for bracketless candidate lines | High | Add parser test covering all four types in bracketless form | P0 |
| Future compactions may re-extract useful facts with changed counts or wording | Medium | Keep storage-time type-aware dedupe/staleness plan | P0, unchanged |

### 1. Parser fix

Accept `- type text` with no brackets.

Also strengthen the prompt. Do both.

The parser is the product boundary. Model output is not a contract, it is an input from an unreliable narrator with excellent vibes. If the model emits a plainly parseable, semantically valid candidate, dropping it silently is a data loss bug.

The prompt should still ask for the preferred bracketed format because bracketed type markers are less ambiguous. But prompt enforcement alone is not enough. The new evidence proves the model sometimes drops brackets even when examples include them.

Recommended parser behavior:

- preferred: `- [project] pathology-playground 後端健康改進計劃已完成 Phase 1-4`
- accepted fallback: `- project pathology-playground 後端健康改進計劃已完成 Phase 1-4`
- still reject unknown types
- still run `shouldAcceptWorkspaceMemoryCandidate()`
- still require body length and existing quality gates

### 2. Prompt format enforcement

Yes, add explicit syntax instructions.

Current prompt shows examples, but examples are not a hard enough constraint. Add one sentence before the examples:

```text
Use exactly this candidate format, including square brackets around the type:
```

Then keep the examples:

```text
Memory candidates:
- [feedback] content
- [project] content
- [decision] content
- [reference] content
```

Optionally add one short warning:

```text
Do not write `- project content`; write `- [project] content`.
```

Keep this short. Long formatting lectures increase prompt surface area and make the summary worse. One positive instruction plus one negative example is enough.

### 3. Impact on dedup plan

Parser robustness moves to P0, before storage dedup/staleness cleanup.

This changes sequencing, not the architecture.

Updated P0 order:

1. **P0a: Fix parser format tolerance and add regression tests.** Lost memory is worse than duplicate memory. A deduper cannot dedupe entries that never made it into the store.
2. **P0b: Implement storage-time dedupe and stale pruning.** Still the main long-term quality fix.
3. **P0c: Tighten prompt format instruction in the same small patch as parser tolerance.** Cheap and reduces fallback-parser usage.

The earlier recommendation still stands: storage normalization remains the authority for duplicates and staleness. This new evidence adds a more basic ingestion reliability bug in front of it.

### 4. Concrete implementation recommendation

#### Regex change
|
||||
|
||||
Replace the current parser line in `src/extractors.ts:parseWorkspaceMemoryCandidates()`:
|
||||
|
||||
```ts
const item = line.trim().match(/^-\s*\[(feedback|project|decision|reference)\]\s*(.+)$/i);
```
|
||||
|
||||
with a single regex that accepts bracketed and bracketless forms:
|
||||
|
||||
```ts
const item = line.trim().match(
  /^-\s*(?:\[(feedback|project|decision|reference)\]|(feedback|project|decision|reference)\b)\s+(.+)$/i,
);
if (!item) continue;

const type = (item[1] ?? item[2]).toLowerCase() as LongTermType;
const body = item[3].trim();
```
|
||||
|
||||
Why this shape:
|
||||
|
||||
- `(?:\[type\]|type\b)` accepts both formats
|
||||
- `\b` prevents `projectile` from being parsed as `project`
|
||||
- `\s+(.+)` requires real content after the type
|
||||
- unknown types still fail
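A minimal standalone sketch of the recommended regex, useful for checking the edge cases listed above. `parseCandidateLine` and `CANDIDATE_RE` are hypothetical names used only for illustration; the real change stays inline in `parseWorkspaceMemoryCandidates()`.

```typescript
type LongTermType = "feedback" | "project" | "decision" | "reference";

// Same pattern as the recommended regex:
// group 1 = bracketed type, group 2 = bare type, group 3 = body.
const CANDIDATE_RE =
  /^-\s*(?:\[(feedback|project|decision|reference)\]|(feedback|project|decision|reference)\b)\s+(.+)$/i;

// Hypothetical helper (not part of the plugin): parse one line or return null.
function parseCandidateLine(line: string): { type: LongTermType; body: string } | null {
  const item = line.trim().match(CANDIDATE_RE);
  if (!item) return null;
  // Exactly one of the two type groups is set, depending on the format used.
  const type = (item[1] ?? item[2]).toLowerCase() as LongTermType;
  return { type, body: item[3].trim() };
}
```

Under these assumptions, `- [project] uses pnpm workspaces` and `- project uses pnpm workspaces` produce the same result, while `- projectile motion notes`, `- note something`, and a bare `- project` with no body all return `null`.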
|
||||
|
||||
For readability, named capture groups can replace the positional indices if the runtime target supports them cleanly (the type alternation is still written twice):
|
||||
|
||||
```ts
const item = line.trim().match(
  /^-\s*(?:\[(?<bracketed>feedback|project|decision|reference)\]|(?<plain>feedback|project|decision|reference)\b)\s+(?<body>.+)$/i,
);
if (!item?.groups) continue;

const type = (item.groups.bracketed ?? item.groups.plain).toLowerCase() as LongTermType;
const body = item.groups.body.trim();
```
|
||||
|
||||
Recommendation: use the non-named-group version. It is uglier, but it is maximally boring and consistent with the existing code style.
|
||||
|
||||
Add tests in `tests/extractors.test.ts`:
|
||||
|
||||
```ts
test("parseWorkspaceMemoryCandidates accepts bracketless candidate format", () => {
  const summary = `
Memory candidates:
- project pathology-playground 後端健康改進計劃已完成 Phase 1-4
- reference Scrypt 參數必須是 N=16384, r=8, p=1
- feedback 端口 9473 可能被舊進程佔用,需殺掉後重啟
- decision Use output.prompt to replace the default compaction template
`;

  const items = parseWorkspaceMemoryCandidates(summary);

  assert.equal(items.length, 4);
  assert.deepEqual(items.map(item => item.type), [
    "project",
    "reference",
    "feedback",
    "decision",
  ]);
});
```
|
||||
|
||||
Also add a guard test:
|
||||
|
||||
```ts
test("parseWorkspaceMemoryCandidates rejects unknown bracketless candidate type", () => {
  const summary = `
Memory candidates:
- note this should not be parsed as memory
`;

  const items = parseWorkspaceMemoryCandidates(summary);

  assert.equal(items.length, 0);
});
```
|
||||
|
||||
#### Prompt change
|
||||
|
||||
In `src/plugin.ts:buildCompactionPrompt()`, change this block:
|
||||
|
||||
```ts
"At the end of the summary, extract durable memory entries for future",
"sessions using these labels:",
"",
"Memory candidates:",
"- [feedback] content",
"- [project] content",
"- [decision] content",
"- [reference] content",
```
|
||||
|
||||
to:
|
||||
|
||||
```ts
"At the end of the summary, extract durable memory entries for future",
"sessions using exactly this candidate format, including square brackets around the type:",
"",
"Memory candidates:",
"- [feedback] content",
"- [project] content",
"- [decision] content",
"- [reference] content",
"",
"Do not write '- project content'; write '- [project] content'.",
```
|
||||
|
||||
This gives the model a crisp positive format and a concrete anti-pattern. The parser still accepts the anti-pattern because users need data capture more than format purity.
|
||||
|
||||
### Final addendum decision
|
||||
|
||||
Parser tolerance is now P0.
|
||||
|
||||
The architecture stays the same: make the storage layer self-cleaning, and make ingestion defensive. But the implementation sequence changes because silent data loss beats duplicate accumulation in severity. First capture valid candidates reliably. Then dedupe and prune them.
|
||||
|
||||
## Addendum 2: content quality guidance
|
||||
|
||||
Date: 2026-04-26
|
||||
|
||||
### Summary table
|
||||
|
||||
| Issue | Severity | Fix | Priority |
|-------|----------|-----|----------|
| Model extracts low-durability progress snapshots as `project` memory | High | Add durable-content guidance to compaction prompt | P0 |
| Exact counts like `1237 tests pass` and `37 files` churn across sessions | High | Add parser quality filter for obvious snapshot patterns | P0 |
| Stable config values are useful and should still pass | Medium | Keep `reference` guidance permissive for config/crypto/PIN values | P0 |
| Environment issues like occupied ports may be useful briefly but not long-term | Medium | Prompt says unresolved issues only; storage staleness handles aging | P1 with staleness work |
|
||||
|
||||
### 1. Architecture fit
|
||||
|
||||
This belongs in both the prompt and the parser, with different responsibilities.
|
||||
|
||||
The prompt should teach the model what "durable" means. The model is choosing what to extract, so it needs product semantics:
|
||||
|
||||
- stable configuration values are good memory
- unresolved bugs can be useful memory
- exact test counts, file counts, and phase progress are usually bad long-term memory
|
||||
|
||||
The parser should still reject obvious low-durability snapshots as a backstop. The parser already has `shouldAcceptWorkspaceMemoryCandidate()` in `src/extractors.ts`; this is exactly where simple content-quality gates belong.
|
||||
|
||||
Do not put subtle semantic judgment in the parser. Do put obvious anti-patterns there.
|
||||
|
||||
Recommended split:
|
||||
|
||||
```text
Prompt
  └─ positive/negative guidance for durable memory selection

Parser quality gate
  └─ deterministic rejection of obvious snapshots
       - exact test counts
       - exact file counts
       - completed Phase N-M progress lines
       - temporary port/process cleanup notes when phrased as resolved/current env state

Storage normalization
  └─ dedupe, supersession, age-based pruning
```
|
||||
|
||||
This is the same design principle as the bracketless parser addendum: ask the model nicely, then make the code defensive.
|
||||
|
||||
### 2. Specificity vs risk
|
||||
|
||||
The proposed guidance is specific, but not too specific.
|
||||
|
||||
It names examples from the observed failure mode, but the rule underneath is general: facts should stay true across sessions. Exact counts and phase numbers are classic snapshot smell in almost every codebase.
|
||||
|
||||
Potential risk: sometimes an exact count is genuinely durable. Example: "USB sync protocol expects exactly 37 manifest entries" could be a stable contract, not a snapshot.
|
||||
|
||||
Mitigation: word the guidance around "session-specific progress" rather than banning all numbers. Keep config values explicitly allowed.
|
||||
|
||||
Good distinction:
|
||||
|
||||
```text
Bad:  1237 tests pass today
Good: Test suite is expected to pass before handoff

Bad:  USB sync currently has 37 files
Good: USB sync covers bundles, server, frontend, tests, and docs

Bad:  Phase 1-4 completed
Good: Backend health work is organized into phased improvements

Good: Scrypt parameters are N=16384, r=8, p=1
```
|
||||
|
||||
The first three are progress snapshots. The Scrypt value is a stable configuration contract. Numbers are not the problem. Temporary state is the problem.
|
||||
|
||||
### 3. Prompt length concern
|
||||
|
||||
Adding four lines is worth it.
|
||||
|
||||
This prompt is already making the model do extraction. Without guidance, the model optimizes for "important-looking facts," and progress snapshots look important. That creates churn, duplicates, and stale memory. Four lines preventing bad memory at the source are cheap.
|
||||
|
||||
If trimming is needed, trim redundant formatting language before removing quality guidance. Formatting mistakes lose entries or require parser tolerance. Content mistakes pollute the store. Both matter, but the durable-content guidance carries more product value than repeated Markdown formatting reminders.
|
||||
|
||||
Recommended trim posture:
|
||||
|
||||
- keep one concise formatting instruction
- keep one concise candidate syntax instruction
- add one concise durable-content block
- avoid long examples or taxonomy tables in the prompt
|
||||
|
||||
The prompt should not become a memory policy document. It just needs the model to stop writing "1237 tests pass" into long-term storage. Wild that we have to say this, but we do.
|
||||
|
||||
### 4. Concrete prompt recommendation
|
||||
|
||||
In `src/plugin.ts:buildCompactionPrompt()`, replace the candidate instruction block with this final version:
|
||||
|
||||
```ts
"At the end of the summary, extract durable memory entries for future sessions.",
"Only extract facts that are likely to stay true across sessions.",
"Do not extract session-specific progress like exact test counts, file counts, or phase numbers.",
"For progress, extract the stable goal or durable milestone, not the current number.",
"For references, extract configuration values that do not usually change between sessions.",
"For feedback, extract unresolved issues or user preferences that future sessions need to know.",
"Use exactly this candidate format, including square brackets around the type:",
"",
"Memory candidates:",
"- [feedback] content",
"- [project] content",
"- [decision] content",
"- [reference] content",
"",
"Do not write '- project content'; write '- [project] content'.",
```
|
||||
|
||||
This is slightly longer than the lead's proposal, but it avoids an overbroad ban on numbers by saying "session-specific progress." It also gives a positive replacement behavior: stable goal or durable milestone.
|
||||
|
||||
If a shorter version is required, use this:
|
||||
|
||||
```ts
"At the end of the summary, extract durable memory entries for future sessions.",
"Only extract facts likely to stay true across sessions; skip exact test counts, file counts, phase numbers, and temporary environment state.",
"References may include stable configuration values. Feedback should be unresolved issues or user preferences future sessions need.",
"Use exactly this candidate format, including square brackets around the type:",
```
|
||||
|
||||
Recommendation: use the longer block. The extra three lines buy clarity and reduce accidental over-filtering.
|
||||
|
||||
### Parser quality gate recommendation
|
||||
|
||||
Add deterministic snapshot rejection to `shouldAcceptWorkspaceMemoryCandidate()`.
|
||||
|
||||
Keep this conservative. Reject obvious snapshots, not every number.
|
||||
|
||||
Suggested first-pass rules:
|
||||
|
||||
```ts
// Session-specific progress snapshots, not durable memory.
if (entry.type === "project") {
  if (/\b\d+\s+tests?\s+pass(?:ed)?\b/i.test(text)) return false;
  if (/\b\d+\s+suites?\b/i.test(text)) return false;
  // \b only follows the ASCII alternative: JavaScript's \b is defined on ASCII word
  // characters, so a trailing \b after 文件 fails before a space, punctuation, or end of text.
  if (/\b\d+\s+(?:files?\b|文件)/i.test(text)) return false;
  if (/\bphase\s*\d+(?:\s*[-–]\s*\d+)?\s+(?:completed|done|finished)\b/i.test(text)) return false;
  if (/已完成\s*Phase\s*\d+(?:\s*[-–]\s*\d+)?/i.test(text)) return false;
}
```
|
||||
|
||||
Do not reject stable `reference` values containing numbers. These must pass:
|
||||
|
||||
```text
Admin PIN 是 456123
Scrypt 參數必須是 N=16384, r=8, p=1
```
|
||||
|
||||
For `feedback`, do not broadly reject ports yet. A port issue can be useful if it explains a recurring failure. Let staleness prune it, unless the text clearly says the issue was resolved. A future parser rule can reject resolved temporary env notes, but the current evidence is not enough to safely block all port-related feedback.
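The conservative posture described here can be sketched as a small standalone predicate. `isProgressSnapshot` and `SNAPSHOT_PATTERNS` are hypothetical names for illustration, not the plugin's API. The patterns follow the first-pass rules, with one assumption made explicit: the word boundary is applied only to the ASCII `files?` alternative, because JavaScript's `\b` does not treat CJK characters as word characters.

```typescript
type LongTermType = "feedback" | "project" | "decision" | "reference";

// Obvious progress-snapshot patterns (no `g` flag, so .test() is stateless).
const SNAPSHOT_PATTERNS: RegExp[] = [
  /\b\d+\s+tests?\s+pass(?:ed)?\b/i,
  /\b\d+\s+suites?\b/i,
  /\b\d+\s+(?:files?\b|文件)/i,
  /\bphase\s*\d+(?:\s*[-–]\s*\d+)?\s+(?:completed|done|finished)\b/i,
  /已完成\s*Phase\s*\d+(?:\s*[-–]\s*\d+)?/i,
];

// Hypothetical helper: true only for `project` entries that match a snapshot pattern.
// `reference` and `feedback` entries are never rejected here, whatever numbers they contain.
function isProgressSnapshot(type: LongTermType, text: string): boolean {
  if (type !== "project") return false;
  return SNAPSHOT_PATTERNS.some(re => re.test(text));
}
```

With this sketch, `1237 tests pass` and `已完成 Phase 1-4` are flagged for `project` entries, while the Scrypt and PIN references pass, and so does a durable count such as "USB sync protocol expects exactly 37 manifest entries".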
|
||||
|
||||
### 5. Integration with storage-time dedup/staleness
|
||||
|
||||
Prompt-level guidance and staleness solve different problems.
|
||||
|
||||
Staleness is cleanup after bad or aging facts are already stored. Prompt guidance prevents low-value facts from entering the store in the first place. Parser filtering catches obvious misses when the prompt fails.
|
||||
|
||||
Do not rely on staleness for exact counts.
|
||||
|
||||
Why:
|
||||
|
||||
- `maxEntries` is 28, so a few bad snapshots can evict useful facts before they age out
|
||||
- exact counts will churn every compaction and create near-duplicates
|
||||
- stale labels still consume render budget until pruning runs
|
||||
- users see noisy memory and trust the feature less
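To make the first two points concrete, here is a toy simulation. It assumes oldest-first eviction once the cap is exceeded and exact-text dedupe only; both are simplifications for illustration, not the store's actual policy. `addWithCap` is a hypothetical helper.

```typescript
const MAX_ENTRIES = 28; // cap quoted in the list above

// Assumption: exact duplicates are merged, and the oldest entry is dropped
// first when the cap is exceeded.
function addWithCap(store: string[], entry: string): string[] {
  if (store.includes(entry)) return store;
  const next = [...store, entry];
  return next.length > MAX_ENTRIES ? next.slice(next.length - MAX_ENTRIES) : next;
}

// Start with a full store of durable facts.
let store: string[] = Array.from({ length: MAX_ENTRIES }, (_, i) => `durable fact ${i + 1}`);

// Each compaction reports a slightly different count, so exact dedupe never merges them.
for (const count of [1237, 1240, 1244, 1251, 1260]) {
  store = addWithCap(store, `${count} tests pass`);
}

const durableLeft = store.filter(e => e.startsWith("durable fact")).length;
console.log(durableLeft); // 23: five durable facts were evicted by five snapshots
```

Under these assumptions every churned count costs one durable entry, long before any age-based pruning would remove the snapshots.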
|
||||
|
||||
Storage-time dedup/staleness remains required for facts that were good when written but later become outdated. Example: a config path that moves, a decision superseded by a better decision, or an unresolved bug that later gets fixed.
|
||||
|
||||
Use this mental model:
|
||||
|
||||
```text
|
||||
Prompt guidance → prevent bad candidates
|
||||
Parser quality gate → reject obvious bad candidates
|
||||
Storage dedupe → merge repeated good candidates
|
||||
Storage staleness → retire once-good candidates that aged out
|
||||
```
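The two storage rows can be sketched as one normalization pass. The entry shape, `dedupeKey`, and `normalizeStore` are simplified assumptions for illustration; the 30-day grace period and the rule that `feedback` and `explicit` entries are never age-pruned follow the `isPrunableByAge()` change in this diff.

```typescript
interface Entry {
  type: "feedback" | "project" | "decision" | "reference";
  source: "explicit" | "manual" | "compaction";
  text: string;
  createdAt: string; // ISO timestamp
  staleAfterDays?: number;
}

const DAY_MS = 86400000;
const GRACE_DAYS = 30;

// Staleness: retire once-good entries that aged out.
function isStale(entry: Entry, now: number): boolean {
  if (entry.type === "feedback" || entry.source === "explicit") return false;
  if (!entry.staleAfterDays) return false;
  const ageDays = (now - new Date(entry.createdAt).getTime()) / DAY_MS;
  return ageDays > entry.staleAfterDays + GRACE_DAYS;
}

// Assumption: a simplified canonical key (lowercase, collapsed whitespace).
function dedupeKey(entry: Entry): string {
  return `${entry.type}:${entry.text.toLowerCase().replace(/\s+/g, " ").trim()}`;
}

// Dedupe: merge repeated candidates, keeping the newer entry for each key.
function normalizeStore(entries: Entry[], now: number): Entry[] {
  const byKey = new Map<string, Entry>();
  for (const entry of entries) {
    if (isStale(entry, now)) continue;
    const key = dedupeKey(entry);
    const existing = byKey.get(key);
    const isNewer =
      !existing || new Date(entry.createdAt).getTime() > new Date(existing.createdAt).getTime();
    if (isNewer) byKey.set(key, entry);
  }
  return Array.from(byKey.values());
}
```

The real storage layer also applies source priority and topic supersession; this sketch only shows how dedupe and staleness divide the work.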
|
||||
|
||||
### Updated priority
|
||||
|
||||
The new content-quality evidence adds another P0 ingestion fix.
|
||||
|
||||
Updated sequence:

1. **P0a: Parser accepts bracketless candidate format and tests it.** Prevent silent data loss.
2. **P0b: Prompt durable-content guidance.** Stop obvious snapshots at the source.
3. **P0c: Parser rejects obvious low-durability `project` snapshots.** Backstop the prompt with deterministic filters.
4. **P0d: Storage-time dedupe and staleness.** Still required for duplicate accumulation and lifecycle cleanup.
|
||||
|
||||
### Final addendum 2 decision
|
||||
|
||||
Add the durable-content guidance to the prompt and add conservative parser filters for obvious `project` snapshots.
|
||||
|
||||
This does not replace storage-time dedupe or staleness. It reduces garbage before it reaches that layer. The store still needs to clean itself, but it should not be used as a trash compactor for facts we already know are temporary.
|
||||
@@ -982,12 +982,34 @@ for (const { text, expected } of QUALITY_GATE_TESTS) {
|
||||
### This week: PR-1
|
||||
|
||||
1. Baseline snapshot
|
||||
2. Task 1: inline exitCode + narrow extractErrorsFromBash + **plugin hook regression test**
|
||||
3. Task 2: budget-aware render + **min envelope handling**
|
||||
4. Task 3: remove bare `always` + **ensure all patterns have `g` flag**
|
||||
2. Task 1: inline exitCode + narrow extractErrorsFromBash + **plugin hook regression test** ✅ DONE
|
||||
3. Task 2: budget-aware render + **min envelope handling** ✅ DONE
|
||||
4. Task 3: remove bare `always` + **ensure all patterns have `g` flag** ✅ DONE
|
||||
5. Manual verification
|
||||
6. Cleanup false positives
|
||||
|
||||
### Hotfix: purple italic rendering issue

**Problem**: The plugin's compaction context output is displayed as purple italic text in the OpenCode UI.

**Root-cause analysis**:
1. First attempt: XML tag `<workspace_memory>` → purple italic
2. Second attempt: HTML comment `<!-- workspace_memory_candidates -->` → still purple italic
3. Third attempt: Markdown heading `## Memory Candidates` → purple (no italic)
4. Fourth attempt: plain-text label `Memory candidates:` → no special rendering ✅

**Solution**: The architect recommended plain-text labels, avoiding all Markdown/XML/HTML syntax.

**Changes**:
- `src/plugin.ts`: `compactionContextHeader()` now uses the `Memory candidates:` label
- `src/plugin.ts`: `renderTodosForCompaction()` now uses the `Pending todos:` label
- `src/extractors.ts`: `extractCandidateBlock()` supports parsing the plain-text format (primary)
- `src/workspace-memory.ts`: `renderWorkspaceMemory()` uses the plain-text `Workspace memory:` label
- `src/session-state.ts`: `renderHotSessionState()` uses the plain-text `Hot session state:` label
- Removed the `stripXmlTags()` function (no longer needed)

**Tests**: All 42 tests pass.
|
||||
|
||||
### Next week: PR-2
|
||||
|
||||
5. Task 5: canonical exact dedupe + **source priority**
|
||||
|
||||
+1
-1
@@ -1,6 +1,6 @@
|
||||
{
|
||||
"name": "opencode-working-memory",
|
||||
"version": "1.2.0",
|
||||
"version": "1.2.1",
|
||||
"description": "Three-layer memory architecture for OpenCode with workspace memory and hot session state",
|
||||
"type": "module",
|
||||
"main": "index.ts",
|
||||
|
||||
+51
-8
@@ -193,8 +193,12 @@ function shouldAcceptWorkspaceMemoryCandidate(entry: {
|
||||
}): boolean {
|
||||
const text = entry.text.trim();
|
||||
|
||||
// Too short
|
||||
if (text.length < 20) return false;
|
||||
// Too short (with type-specific allowlist for stable config values)
|
||||
if (entry.type === "reference" && /\b(?:admin\s+)?pin\s|scrypt|n=\d+|r=\d+|p=\d+/i.test(text)) {
|
||||
// Stable config values can be short — allow below generic min length
|
||||
} else if (text.length < 20) {
|
||||
return false;
|
||||
}
|
||||
|
||||
// Git history / commit hash
|
||||
if (/\b[0-9a-f]{7,40}\b/.test(text)) return false;
|
||||
@@ -218,21 +222,60 @@ function shouldAcceptWorkspaceMemoryCandidate(entry: {
|
||||
const pathCount = (text.match(/\/[\w.-]+(\/[\w.-]+)+/g) || []).length;
|
||||
if (pathCount > 2) return false;
|
||||
|
||||
// Session-specific progress snapshots for project type
|
||||
if (entry.type === "project") {
|
||||
if (/\b\d+\s+tests?\s+pass(?:ed)?\b/i.test(text)) return false;
|
||||
if (/\b\d+\s+suites?\b/i.test(text)) return false;
|
||||
if (/\b\d+\s*(?:個|个)?\s*(?:files?|文件)/i.test(text)) return false;
|
||||
// Reject "Phase N completed" using semantic window (within 20 chars either direction)
|
||||
if (/\bphase\s*\d+(?:\s*[-–]\s*\d+)?\b.{0,20}\b(?:completed|done|finished)\b/i.test(text)) return false;
|
||||
if (/\b(?:completed|done|finished)\b.{0,20}\bphase\s*\d+(?:\s*[-–]\s*\d+)?\b/i.test(text)) return false;
|
||||
if (/已完成.{0,20}Phase\s*\d+(?:\s*[-–]\s*\d+)?/i.test(text)) return false;
|
||||
if (/Phase\s*\d+(?:\s*[-–]\s*\d+)?.{0,20}已完成/i.test(text)) return false;
|
||||
}
|
||||
|
||||
return true;
|
||||
}
|
||||
|
||||
/**
|
||||
* Extract candidate block from summary using multiple formats.
|
||||
* Supports: Plain text label, Markdown section, legacy XML.
|
||||
*/
|
||||
function extractCandidateBlock(summary: string): string | null {
|
||||
// 1. Plain text label (primary format, no Markdown header)
|
||||
const plainMatch = summary.match(/Memory candidates:\s*\n([\s\S]*?)(?:\n[A-Z][a-z]+ [a-z]+:|\n##\s|$)/i);
|
||||
if (plainMatch) return plainMatch[1];
|
||||
|
||||
// 2. Markdown section (legacy)
|
||||
const markdownMatch = summary.match(/##\s*Memory Candidates\s*\n([\s\S]*?)(?:\n##\s|$)/i);
|
||||
if (markdownMatch) return markdownMatch[1];
|
||||
|
||||
// 3. Legacy "Workspace Memory Candidates" section
|
||||
const legacyMatch = summary.match(/##\s*Workspace Memory Candidates\s*\n([\s\S]*?)(?:\n##\s|$)/i);
|
||||
if (legacyMatch) return legacyMatch[1];
|
||||
|
||||
// 4. Legacy XML block (backward compatible)
|
||||
const xmlMatch = summary.match(/<workspace_memory_candidates>([\s\S]*?)<\/workspace_memory_candidates>/i);
|
||||
if (xmlMatch) return xmlMatch[1];
|
||||
|
||||
return null;
|
||||
}
|
||||
|
||||
export function parseWorkspaceMemoryCandidates(summary: string): LongTermMemoryEntry[] {
|
||||
const match = summary.match(/<workspace_memory_candidates>([\s\S]*?)<\/workspace_memory_candidates>/i);
|
||||
if (!match) return [];
|
||||
const block = extractCandidateBlock(summary);
|
||||
if (!block) return [];
|
||||
|
||||
const now = new Date().toISOString();
|
||||
const entries: LongTermMemoryEntry[] = [];
|
||||
|
||||
for (const line of match[1].split("\n")) {
|
||||
const item = line.trim().match(/^-\s*\[(feedback|project|decision|reference)\]\s*(.+)$/i);
|
||||
for (const line of block.split("\n")) {
|
||||
// Accept both "- [type] text" (bracketed) and "- type text" (bracketless)
|
||||
const item = line.trim().match(
|
||||
/^-\s*(?:\[(feedback|project|decision|reference)\]|(feedback|project|decision|reference)\b)\s+(.+)$/i,
|
||||
);
|
||||
if (!item) continue;
|
||||
const type = item[1].toLowerCase() as LongTermType;
|
||||
const body = item[2].trim();
|
||||
const type = (item[1] ?? item[2]).toLowerCase() as LongTermType;
|
||||
const body = item[3].trim();
|
||||
if (body.length < 12) continue;
|
||||
|
||||
// Apply quality gate
|
||||
|
||||
+97
-32
@@ -46,38 +46,83 @@ import {
|
||||
} from "./opencode.ts";
|
||||
|
||||
/**
|
||||
* Generate the memory candidate instruction to include in compaction context.
|
||||
* Build the complete compaction prompt.
|
||||
*
|
||||
* Replaces OpenCode's default template (which uses --- separators that trigger
|
||||
* YAML frontmatter comment scope in markdown rendering, producing purple italic text).
|
||||
* Our template uses only ## Markdown headings and explicitly forbids YAML frontmatter,
|
||||
* horizontal rules, and delimiter lines.
|
||||
*
|
||||
* @param privateContext - Background context (workspace memory, hot session state,
|
||||
* pending todos) from our plugin and any other plugins. Shown to the model to
|
||||
* inform the summary but not copied verbatim.
|
||||
*/
|
||||
function memoryCandidateInstruction(): string {
|
||||
return `
|
||||
At the end of the compaction summary, include:
|
||||
|
||||
<workspace_memory_candidates>
|
||||
- [feedback] ...
|
||||
- [project] ...
|
||||
- [decision] ...
|
||||
- [reference] ...
|
||||
</workspace_memory_candidates>
|
||||
|
||||
Only include durable information useful across future sessions in this exact workspace.
|
||||
Do NOT include active file lists, raw errors, temporary progress, stack traces, code signatures, API docs, git history, or facts easily rediscovered from the repository.
|
||||
For decisions, include rationale in one sentence.
|
||||
If nothing qualifies, output an empty block.
|
||||
`.trim();
|
||||
function buildCompactionPrompt(privateContext: string): string {
|
||||
return [
|
||||
"Provide a detailed summary for continuing our conversation above.",
|
||||
"Focus on information that would help another agent continue the work: the goal, user instructions, completed work, current state, decisions, relevant files, and next steps.",
|
||||
"",
|
||||
"Do not call any tools. Respond only with the summary text.",
|
||||
"Respond in the same language as the user's messages in the conversation.",
|
||||
"",
|
||||
"Formatting rules:",
|
||||
"- Start the response with \"## Goal\".",
|
||||
"- Use Markdown headings only.",
|
||||
"- Do not output YAML frontmatter.",
|
||||
"- Do not output horizontal rules.",
|
||||
"- Do not wrap the summary in delimiter lines such as ---.",
|
||||
"- Do not use code fences around the summary.",
|
||||
"",
|
||||
"Use this structure:",
|
||||
"",
|
||||
"## Goal",
|
||||
"",
|
||||
"## Instructions",
|
||||
"",
|
||||
"## Progress",
|
||||
"",
|
||||
"## Key Decisions",
|
||||
"",
|
||||
"## Discoveries",
|
||||
"",
|
||||
"## Next Steps",
|
||||
"",
|
||||
"## Relevant Files",
|
||||
"",
|
||||
"At the end of the summary, extract durable memory entries for future sessions.",
|
||||
"Only extract facts that are likely to stay true across sessions.",
|
||||
"Do not extract session-specific progress like exact test counts, file counts, or phase numbers.",
|
||||
"For progress, extract the stable goal or durable milestone, not the current number.",
|
||||
"For references, extract configuration values that do not usually change between sessions.",
|
||||
"For feedback, extract unresolved issues or user preferences that future sessions need to know.",
|
||||
"Use exactly this candidate format, including square brackets around the type:",
|
||||
"",
|
||||
"Memory candidates:",
|
||||
"- [feedback] content",
|
||||
"- [project] content",
|
||||
"- [decision] content",
|
||||
"- [reference] content",
|
||||
"",
|
||||
"Do not write '- project content'; write '- [project] content'.",
|
||||
"",
|
||||
"Background context, use this to inform the summary above.",
|
||||
"Do not output this context verbatim:",
|
||||
"",
|
||||
privateContext,
|
||||
].join("\n");
|
||||
}
|
||||
|
||||
/**
|
||||
* Render todos for compaction context.
|
||||
* Render todos for compaction context (plain text format, no Markdown headers).
|
||||
*/
|
||||
function renderTodos(todos: Array<{ content: string; status: string; priority?: string }>): string {
|
||||
function renderTodosForCompaction(todos: Array<{ content: string; status: string; priority?: string }>): string {
|
||||
if (todos.length === 0) return "";
|
||||
|
||||
const lines = ["<pending_todos>"];
|
||||
const lines = ["Pending todos:"];
|
||||
for (const todo of todos) {
|
||||
const priority = todo.priority ? ` [${todo.priority}]` : "";
|
||||
lines.push(`- ${todo.content}${priority}`);
|
||||
const status = todo.status === "completed" ? "✓" : todo.status === "in_progress" ? "→" : "○";
|
||||
lines.push(`- ${status} ${todo.content}${priority}`);
|
||||
}
|
||||
lines.push("</pending_todos>");
|
||||
return lines.join("\n");
|
||||
}
|
||||
|
||||
@@ -292,7 +337,17 @@ export const MemoryV2Plugin: Plugin = async (input) => {
|
||||
await processLatestUserMessage(sessionID);
|
||||
},
|
||||
|
||||
// Add compaction context before summarization
|
||||
/**
|
||||
* Replace the default compaction prompt with a ---free template.
|
||||
*
|
||||
* OpenCode's default template wraps sections in --- separators. When the
|
||||
* model follows the template (which our structured context encourages),
|
||||
* the TUI renders --- at position 0 as YAML frontmatter, applying the
|
||||
* "comment" syntax scope (purple italic in palenight theme).
|
||||
*
|
||||
* We set output.prompt to replace the entire prompt, removing all ---
|
||||
* and explicitly forbidding YAML frontmatter / horizontal rules.
|
||||
*/
|
||||
"experimental.session.compacting": async (hookInput, output) => {
|
||||
const { sessionID } = hookInput;
|
||||
if (!sessionID) return;
|
||||
@@ -300,7 +355,12 @@ export const MemoryV2Plugin: Plugin = async (input) => {
|
||||
// Sub-agents don't need compaction support
|
||||
if (await isSubAgent(sessionID)) return;
|
||||
|
||||
// Add compaction context with memory, hot state, todos, and instruction
|
||||
// Preserve context injected by other plugins that ran before us.
|
||||
// Setting output.prompt bypasses the default prompt + context join,
|
||||
// so we must explicitly carry forward any existing output.context.
|
||||
const otherContext = output.context.filter(Boolean).join("\n\n");
|
||||
|
||||
// Build our private context (workspace memory, hot state, todos)
|
||||
const contextParts: string[] = [];
|
||||
|
||||
// 1. Frozen workspace memory
|
||||
@@ -319,18 +379,23 @@ export const MemoryV2Plugin: Plugin = async (input) => {
|
||||
|
||||
// 3. Pending todos from OpenCode
|
||||
const todos = await pendingTodos(client, sessionID);
|
||||
const todosPrompt = renderTodos(todos);
|
||||
const todosPrompt = renderTodosForCompaction(todos);
|
||||
if (todosPrompt) {
|
||||
contextParts.push(todosPrompt);
|
||||
}
|
||||
|
||||
// 4. Memory candidate instruction
|
||||
contextParts.push(memoryCandidateInstruction());
|
||||
// Combine: other plugins' context first, then our private context
|
||||
const privateContext = [otherContext, ...contextParts]
|
||||
.filter(Boolean)
|
||||
.join("\n\n");
|
||||
|
||||
// Add to compaction context (output.context is an array)
|
||||
for (const part of contextParts) {
|
||||
output.context.push(part);
|
||||
}
|
||||
// Replace the default prompt entirely with our ---free template
|
||||
output.prompt = buildCompactionPrompt(privateContext);
|
||||
|
||||
// Clear context array since we consumed it into output.prompt.
|
||||
// Subsequent plugins that set output.prompt will also need to check
|
||||
// output.context if they want to preserve other plugin contributions.
|
||||
output.context.length = 0;
|
||||
},
|
||||
|
||||
// Handle session events
|
||||
|
||||
@@ -180,7 +180,7 @@ export function renderHotSessionState(state: SessionState, workspaceRoot: string
|
||||
|
||||
if (activeFiles.length === 0 && openErrors.length === 0 && decisions.length === 0) return "";
|
||||
|
||||
const lines: string[] = ["<hot_session_state>"];
|
||||
const lines: string[] = ["Hot session state (current session):"];
|
||||
|
||||
if (activeFiles.length > 0) {
|
||||
lines.push("active_files:");
|
||||
@@ -204,7 +204,6 @@ export function renderHotSessionState(state: SessionState, workspaceRoot: string
|
||||
}
|
||||
}
|
||||
|
||||
lines.push("</hot_session_state>");
|
||||
return lines.join("\n").slice(0, HOT_STATE_LIMITS.maxRenderedChars);
|
||||
}
|
||||
|
||||
|
||||
+162
-21
@@ -76,30 +76,169 @@ function canonicalMemoryText(text: string): string {
|
||||
.trim();
|
||||
}
|
||||
|
||||
/** Extract entity/destination keys for project and reference dedup */
|
||||
function extractEntityKey(text: string): string | null {
|
||||
const normalized = canonicalMemoryText(text);
|
||||
// Check known key phrases (bilingual-friendly)
|
||||
// opencode + agenthub plugin system
|
||||
if (/opencode.*agenthub/i.test(normalized)) {
|
||||
return "opencode-agenthub plugin system";
|
||||
}
|
||||
// For generic config references, fall back to canonical text dedup — no entity key
|
||||
return null;
|
||||
}
|
||||
|
||||
/** Extract decision topic key for supersession detection */
function decisionTopicKey(text: string): string | null {
  const normalized = text.toLowerCase();
  // Parser format versions
  if (/parser.*formats?|supports?\s*\d+\s*format/i.test(normalized)) {
    return "parser-supported-formats";
  }
  // Compaction template replacement
  if (/compaction.*template|output\.prompt|template.*replace/i.test(normalized)) {
    return "compaction-template-replacement";
  }
  // Plugin loading
  if (/plugin.*load|npm.*cache|plugin.*config/i.test(normalized)) {
    return "plugin-loading-config";
  }
  // Output format changes (purple/italic, YAML frontmatter, etc)
  if (/purple.*italic|markup|markdown.*render|frontmatter/i.test(normalized)) {
    return "output-format-rendering";
  }
  return null;
}

/** Extract feedback topic key for supersession detection */
function feedbackTopicKey(text: string): string | null {
  const normalized = text.toLowerCase();
  // Purple/italic rendering issue
  if (/purple.*italic/i.test(normalized)) {
    return "purple-italic-rendering";
  }
  // Browser login/server errors (500 internal_error)
  if (/login.*500|500.*internal|internal_error|server.*error/i.test(normalized)) {
    return "server-error";
  }
  // Port occupied / environment issues
  if (/port.*occup|9473|端口|舊進程|旧进程/i.test(normalized)) {
    return "port-occupied-environment";
  }
  // Theme preferences
  if (/theme|dark.*light|prefer.*theme/i.test(normalized)) {
    return "theme-preference";
  }
  return null;
}

/** Check if entry should be pruned by age (for compaction/manual entries only) */
function isPrunableByAge(entry: LongTermMemoryEntry, now: number): boolean {
  // Never prune feedback or explicit entries
  if (entry.type === "feedback") return false;
  if (entry.source === "explicit") return false;
  if (!entry.staleAfterDays) return false;

  const createdAt = new Date(entry.createdAt).getTime();
  const ageDays = (now - createdAt) / 86400000;
  const grace = 30; // 30-day grace period
  return ageDays > entry.staleAfterDays + grace;
}

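The age check in `isPrunableByAge` amounts to one comparison: an entry is pruned only once its age exceeds `staleAfterDays` plus the 30-day grace period. The sketch below works that arithmetic through for the two cases the test suite uses (45-day staleness at 76 and at 60 days old). `AgedEntry` and `isPastStaleness` are hypothetical stand-ins written for this illustration, not names from the repository.

```typescript
// Hypothetical, trimmed-down stand-in for LongTermMemoryEntry (assumption:
// only the fields the age check reads are modelled here).
interface AgedEntry {
  type: "feedback" | "project" | "decision" | "reference";
  source: "explicit" | "manual" | "compaction";
  createdAt: string; // ISO timestamp
  staleAfterDays?: number;
}

const MS_PER_DAY = 86400000;
const GRACE_DAYS = 30; // same grace period as in the diff above

// Illustrative re-statement of the rule: feedback and explicit entries are
// exempt; everything else is pruned once age > staleAfterDays + grace.
function isPastStaleness(entry: AgedEntry, now: number): boolean {
  if (entry.type === "feedback" || entry.source === "explicit") return false;
  if (!entry.staleAfterDays) return false;
  const ageDays = (now - new Date(entry.createdAt).getTime()) / MS_PER_DAY;
  return ageDays > entry.staleAfterDays + GRACE_DAYS;
}

// Fixed reference time so the example is deterministic.
const referenceNow = Date.parse("2026-04-26T00:00:00Z");
const daysAgo = (n: number): string =>
  new Date(referenceNow - n * MS_PER_DAY).toISOString();

const staleDecision: AgedEntry = {
  type: "decision", source: "compaction", createdAt: daysAgo(76), staleAfterDays: 45,
};
const withinGrace: AgedEntry = {
  type: "decision", source: "compaction", createdAt: daysAgo(60), staleAfterDays: 45,
};
const oldExplicit: AgedEntry = {
  type: "reference", source: "explicit", createdAt: daysAgo(500), staleAfterDays: 90,
};

console.log(isPastStaleness(staleDecision, referenceNow)); // true  (76 > 45 + 30)
console.log(isPastStaleness(withinGrace, referenceNow));   // false (60 <= 75)
console.log(isPastStaleness(oldExplicit, referenceNow));   // false (explicit is exempt)
```

Passing `now` in as a parameter, as the original function does, keeps the check deterministic under test.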
/** Choose better memory when identity/topic keys conflict */
function chooseBetterMemory(
  a: LongTermMemoryEntry,
  b: LongTermMemoryEntry,
  mode: "entity" | "supersession" = "entity",
): LongTermMemoryEntry {
  // Source priority: explicit > manual > compaction
  if (sourcePriority(a.source) !== sourcePriority(b.source)) {
    return sourcePriority(a.source) > sourcePriority(b.source) ? a : b;
  }
  // Higher confidence wins
  if (a.confidence !== b.confidence) {
    return a.confidence > b.confidence ? a : b;
  }
  // For entity dedup: longer (more specific) beats shorter
  // For supersession: newer beats older (and thus longer is not preferred)
  if (mode === "supersession") {
    // Newer wins for same-topic supersession
    if (new Date(a.createdAt).getTime() !== new Date(b.createdAt).getTime()) {
      return new Date(a.createdAt) > new Date(b.createdAt) ? a : b;
    }
    return a.text.length > b.text.length ? a : b;
  }
  // Entity mode: longer text means more specific
  if (Math.abs(a.text.length - b.text.length) > 10) {
    return a.text.length > b.text.length ? a : b;
  }
  // Freshness tie-breaker
  return new Date(a.createdAt) > new Date(b.createdAt) ? a : b;
}

export function enforceLongTermLimits(entries: LongTermMemoryEntry[]): LongTermMemoryEntry[] {
|
||||
const byKey = new Map<string, LongTermMemoryEntry>();
|
||||
const now = Date.now();
|
||||
|
||||
for (const entry of entries.filter(entry => entry.status === "active")) {
|
||||
const text = entry.text.slice(0, LONG_TERM_LIMITS.maxEntryTextChars);
|
||||
const key = `${entry.type}:${canonicalMemoryText(text)}`;
|
||||
// Phase 1: filter active, prune by age
|
||||
const phase1 = entries
|
||||
.filter(entry => entry.status === "active")
|
||||
.filter(entry => !isPrunableByAge(entry, now))
|
||||
.map(entry => ({ ...entry, text: entry.text.slice(0, LONG_TERM_LIMITS.maxEntryTextChars) }));
|
||||
|
||||
const existing = byKey.get(key);
|
||||
// For project/reference/feedback: detect entity keys FIRST, then dedupe by entity OR canonical
|
||||
const projectRefEntries = phase1.filter(e => e.type === "project" || e.type === "reference" || e.type === "feedback");
|
||||
|
||||
// Source priority: explicit > manual > compaction
|
||||
// Same source: higher confidence wins
|
||||
// Build entity key dedup for project/reference/feedback
|
||||
const entityDeduped = new Map<string, LongTermMemoryEntry>();
|
||||
for (const entry of projectRefEntries) {
|
||||
const entityKey = entry.type === "project" || entry.type === "reference"
|
||||
? extractEntityKey(entry.text)
|
||||
: feedbackTopicKey(entry.text);
|
||||
const key = entityKey ? `${entry.type}:${entityKey}` : `${entry.type}:${canonicalMemoryText(entry.text)}`;
|
||||
|
||||
const existing = entityDeduped.get(key);
|
||||
if (!existing) {
|
||||
byKey.set(key, { ...entry, text });
|
||||
} else if (sourcePriority(entry.source) > sourcePriority(existing.source)) {
|
||||
byKey.set(key, { ...entry, text });
|
||||
} else if (sourcePriority(entry.source) === sourcePriority(existing.source)) {
|
||||
if (entry.confidence > existing.confidence) {
|
||||
byKey.set(key, { ...entry, text });
|
||||
entityDeduped.set(key, entry);
|
||||
} else {
|
||||
// Feedback topic conflicts use supersession mode (newer beats longer)
|
||||
const mode = entry.type === "feedback" && entityKey ? "supersession" as const : "entity" as const;
|
||||
if (chooseBetterMemory(entry, existing, mode) === entry) {
|
||||
entityDeduped.set(key, entry);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return [...byKey.values()]
|
||||
.sort((a, b) => priority(b) - priority(a))
|
||||
// For decisions: detect topic keys for supersession, or use canonical
|
||||
const decisionEntries = phase1.filter(e => e.type === "decision");
|
||||
const decisionDeduped = new Map<string, LongTermMemoryEntry>();
|
||||
for (const entry of decisionEntries) {
|
||||
const topic = decisionTopicKey(entry.text);
|
||||
const key = topic ? `decision:${topic}` : `decision:${canonicalMemoryText(entry.text)}`;
|
||||
|
||||
const existing = decisionDeduped.get(key);
|
||||
if (!existing) {
|
||||
decisionDeduped.set(key, entry);
|
||||
} else if (chooseBetterMemory(entry, existing, "supersession") === entry) {
|
||||
decisionDeduped.set(key, entry);
|
||||
}
|
||||
}
|
||||
|
||||
// Merge deduped entries
|
||||
const phaseFinal = new Map<string, LongTermMemoryEntry>();
|
||||
for (const entry of [...entityDeduped.values(), ...decisionDeduped.values()]) {
|
||||
phaseFinal.set(entry.id, entry);
|
||||
}
|
||||
|
||||
// Phase 6: sort and trim
|
||||
return [...phaseFinal.values()]
|
||||
.sort((a, b) => {
|
||||
const pA = priorityWithFreshness(a);
|
||||
const pB = priorityWithFreshness(b);
|
||||
if (pB !== pA) return pB - pA;
|
||||
const sourceDiff = sourcePriority(b.source) - sourcePriority(a.source);
|
||||
if (sourceDiff !== 0) return sourceDiff;
|
||||
return new Date(b.createdAt).getTime() - new Date(a.createdAt).getTime();
|
||||
})
|
||||
.slice(0, LONG_TERM_LIMITS.maxEntries);
|
||||
}
|
||||
|
||||
@@ -115,6 +254,11 @@ function priority(entry: LongTermMemoryEntry): number {
  return sourceWeight + typeWeight + entry.confidence * 10;
}

/** Extended priority including freshness for tie-breaking */
function priorityWithFreshness(entry: LongTermMemoryEntry): number {
  return priority(entry);
}

function wouldFit(
  lines: string[],
  nextLine: string,
@@ -136,10 +280,8 @@ export function renderWorkspaceMemory(store: WorkspaceMemoryStore): string {
|
||||
// If maxChars smaller than minimum envelope, return empty string
|
||||
if (maxChars < MIN_ENVELOPE_LENGTH) return "";
|
||||
|
||||
const closing = "</workspace_memory>";
|
||||
const lines: string[] = [
|
||||
"<workspace_memory>",
|
||||
"Persistent workspace memory. Use as background; verify stale or code-related claims.",
|
||||
"Workspace memory (cross-session, verify if stale):",
|
||||
];
|
||||
|
||||
for (const type of ["feedback", "project", "decision", "reference"] as const) {
|
||||
@@ -150,17 +292,16 @@ export function renderWorkspaceMemory(store: WorkspaceMemoryStore): string {
|
||||
|
||||
for (const item of items) {
|
||||
const line = `- ${renderEntry(item)}`;
|
||||
if (wouldFit([...lines, ...sectionLines], line, closing, maxChars)) {
|
||||
if ([...lines, ...sectionLines, line].join("\n").length <= maxChars) {
|
||||
sectionLines.push(line);
|
||||
}
|
||||
}
|
||||
|
||||
if (sectionLines.length > 1 && wouldFit(lines, sectionLines[0], closing, maxChars)) {
|
||||
if (sectionLines.length > 1) {
|
||||
lines.push(...sectionLines);
|
||||
}
|
||||
}
|
||||
|
||||
lines.push(closing);
|
||||
return lines.join("\n");
|
||||
}
|
||||
|
||||
|
||||
+164
-16
@@ -133,9 +133,8 @@ import { parseWorkspaceMemoryCandidates } from "../src/extractors.ts";
|
||||
|
||||
test("parseWorkspaceMemoryCandidates rejects short text", () => {
|
||||
const summary = `
|
||||
<workspace_memory_candidates>
|
||||
## Memory Candidates
|
||||
- [decision] short text
|
||||
</workspace_memory_candidates>
|
||||
`;
|
||||
const items = parseWorkspaceMemoryCandidates(summary);
|
||||
assert.equal(items.length, 0);
|
||||
@@ -143,9 +142,8 @@ test("parseWorkspaceMemoryCandidates rejects short text", () => {
|
||||
|
||||
test("parseWorkspaceMemoryCandidates rejects git commit hash", () => {
|
||||
const summary = `
|
||||
<workspace_memory_candidates>
|
||||
## Memory Candidates
|
||||
- [project] abc123def456 is the commit hash
|
||||
</workspace_memory_candidates>
|
||||
`;
|
||||
const items = parseWorkspaceMemoryCandidates(summary);
|
||||
assert.equal(items.length, 0);
|
||||
@@ -153,9 +151,8 @@ test("parseWorkspaceMemoryCandidates rejects git commit hash", () => {
|
||||
|
||||
test("parseWorkspaceMemoryCandidates rejects raw error", () => {
|
||||
const summary = `
|
||||
<workspace_memory_candidates>
|
||||
## Memory Candidates
|
||||
- [feedback] TypeError: Cannot read property 'x' of undefined
|
||||
</workspace_memory_candidates>
|
||||
`;
|
||||
const items = parseWorkspaceMemoryCandidates(summary);
|
||||
assert.equal(items.length, 0);
|
||||
@@ -163,9 +160,8 @@ test("parseWorkspaceMemoryCandidates rejects raw error", () => {
|
||||
|
||||
test("parseWorkspaceMemoryCandidates rejects stack trace", () => {
|
||||
const summary = `
|
||||
<workspace_memory_candidates>
|
||||
## Memory Candidates
|
||||
- [reference] at foo (bar.ts:10:5)
|
||||
</workspace_memory_candidates>
|
||||
`;
|
||||
const items = parseWorkspaceMemoryCandidates(summary);
|
||||
assert.equal(items.length, 0);
|
||||
@@ -173,9 +169,8 @@ test("parseWorkspaceMemoryCandidates rejects stack trace", () => {
|
||||
|
||||
test("parseWorkspaceMemoryCandidates rejects commit prefix", () => {
|
||||
const summary = `
|
||||
<workspace_memory_candidates>
|
||||
## Memory Candidates
|
||||
- [project] fix: add new feature
|
||||
</workspace_memory_candidates>
|
||||
`;
|
||||
const items = parseWorkspaceMemoryCandidates(summary);
|
||||
assert.equal(items.length, 0);
|
||||
@@ -183,9 +178,8 @@ test("parseWorkspaceMemoryCandidates rejects commit prefix", () => {
|
||||
|
||||
test("parseWorkspaceMemoryCandidates rejects path-heavy facts", () => {
|
||||
const summary = `
|
||||
<workspace_memory_candidates>
|
||||
## Memory Candidates
|
||||
- [project] files at /src/a.ts /src/b.ts /src/c.ts are important
|
||||
</workspace_memory_candidates>
|
||||
`;
|
||||
const items = parseWorkspaceMemoryCandidates(summary);
|
||||
assert.equal(items.length, 0);
|
||||
@@ -193,9 +187,8 @@ test("parseWorkspaceMemoryCandidates rejects path-heavy facts", () => {
|
||||
|
||||
test("parseWorkspaceMemoryCandidates accepts valid decision", () => {
|
||||
const summary = `
|
||||
<workspace_memory_candidates>
|
||||
## Memory Candidates
|
||||
- [decision] Use pnpm instead of npm for package management
|
||||
</workspace_memory_candidates>
|
||||
`;
|
||||
const items = parseWorkspaceMemoryCandidates(summary);
|
||||
assert.equal(items.length, 1);
|
||||
@@ -205,11 +198,166 @@ test("parseWorkspaceMemoryCandidates accepts valid decision", () => {
|
||||
|
||||
test("parseWorkspaceMemoryCandidates accepts valid project info", () => {
|
||||
const summary = `
|
||||
<workspace_memory_candidates>
|
||||
## Memory Candidates
|
||||
- [project] This project uses TypeScript for all source files
|
||||
</workspace_memory_candidates>
|
||||
`;
|
||||
const items = parseWorkspaceMemoryCandidates(summary);
|
||||
assert.equal(items.length, 1);
|
||||
assert.equal(items[0].type, "project");
|
||||
});
|
||||
|
||||
test("parseWorkspaceMemoryCandidates accepts plain text label format (no Markdown)", () => {
  const summary = `
Memory candidates:
- [decision] Use plain text labels to avoid purple Markdown headers
- [project] This repo uses pnpm for package management
`;
  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 2);
  assert.equal(items[0].type, "decision");
  assert.equal(items[1].type, "project");
});

test("parseWorkspaceMemoryCandidates accepts bracketless candidate format", () => {
  const summary = `
Memory candidates:
- project Backend health improvements organized into phased milestones
- reference Scrypt 參數必須是 N=16384, r=8, p=1
- feedback 端口 9473 可能被舊進程佔用,需殺掉後重啟
- decision Use output.prompt to replace the default compaction template
`;

  const items = parseWorkspaceMemoryCandidates(summary);

  assert.equal(items.length, 4, "Should parse all 4 bracketless candidates");
  assert.deepEqual(items.map(i => i.type), [
    "project",
    "reference",
    "feedback",
    "decision",
  ]);
});

test("parseWorkspaceMemoryCandidates rejects unknown bracketless candidate type", () => {
  const summary = `
Memory candidates:
- note this should not be parsed as memory
`;

  const items = parseWorkspaceMemoryCandidates(summary);

  assert.equal(items.length, 0);
});

test("parseWorkspaceMemoryCandidates rejects bracketless very short body", () => {
  const summary = `
Memory candidates:
- project short
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 0);
});

test("parseWorkspaceMemoryCandidates does not match bracketless type as substring", () => {
  // "projectile" should NOT match "project"
  const summary = `
Memory candidates:
- projectile launcher should not be parsed as a project memory
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 0);
});

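The bracketed, bracketless, unknown-type, and substring cases exercised by these tests can be covered by a single line pattern. What follows is a minimal sketch under stated assumptions: `CANDIDATE_LINE` and `parseCandidateLine` are hypothetical names, and the real `parseWorkspaceMemoryCandidates` in `src/extractors.ts` may be structured quite differently (it also applies length and content filters that this sketch omits).

```typescript
type MemoryType = "feedback" | "project" | "decision" | "reference";

// Hypothetical pattern: accepts "- [type] text" and "- type text".
// The \b after the bracketless type stops "projectile" matching "project".
const CANDIDATE_LINE =
  /^-\s+(?:\[(feedback|project|decision|reference)\]|(feedback|project|decision|reference)\b)\s+(.+)$/;

function parseCandidateLine(
  line: string,
): { type: MemoryType; text: string } | null {
  const match = CANDIDATE_LINE.exec(line.trim());
  if (!match) return null;
  // Exactly one of the two type groups is populated on a successful match.
  const type = (match[1] ?? match[2]) as MemoryType;
  return { type, text: match[3].trim() };
}

const bracketed = parseCandidateLine(
  "- [decision] Use pnpm instead of npm for package management",
);
const bracketless = parseCandidateLine(
  "- reference Scrypt 參數必須是 N=16384, r=8, p=1",
);
const unknownType = parseCandidateLine("- note this should not be parsed as memory");
const substring = parseCandidateLine(
  "- projectile launcher should not be parsed as a project memory",
);

console.log(bracketed?.type);   // "decision"
console.log(bracketless?.type); // "reference"
console.log(unknownType);       // null
console.log(substring);         // null
```

Listing the allowed types in the pattern itself, rather than matching any leading word, is what makes `- note ...` fall through without a separate allowlist check.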
test("parseWorkspaceMemoryCandidates rejects exact test count snapshots", () => {
  const summary = `
Memory candidates:
- project 1237 tests pass, 226 suites
- project 500 tests pass today
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 0, "Exact test counts are session snapshots, not durable memory");
});

test("parseWorkspaceMemoryCandidates rejects exact file count snapshots", () => {
  const summary = `
Memory candidates:
- project USB 同步 37 個文件
- project 42 files synced
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 0, "Exact file counts are session snapshots");
});

test("parseWorkspaceMemoryCandidates rejects phase progress snapshots", () => {
  const summary = `
Memory candidates:
- project Phase 1-4 已完成
- project Phase 3 completed
- project Completed phase 1
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 0, "Phase progress is session snapshot, not durable milestone");
});

test("parseWorkspaceMemoryCandidates accepts durable project facts", () => {
  const summary = `
Memory candidates:
- project Backend health improvements organized into phased milestones
- project USB sync covers bundles, server, frontend, tests, and docs
- project Test suite expected to pass before handoff
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 3, "Durable project facts should pass");
});

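One way to separate the rejected snapshots above from the accepted durable facts is a small list of patterns that all require an exact number next to a count-like word. The patterns below are assumptions chosen to agree with the fixtures in these tests; they are not taken from the plugin source, and `looksLikeSessionSnapshot` is a hypothetical name.

```typescript
// Hypothetical snapshot detectors (assumed patterns, not the plugin's own).
const SNAPSHOT_PATTERNS: RegExp[] = [
  /\d+\s+tests?\s+pass/i,             // "1237 tests pass", "500 tests pass today"
  /\d+\s*(?:個\s*)?(?:files?|文件)/i, // "42 files synced", "37 個文件"
  /phase\s*\d+/i,                     // "Phase 1-4", "Completed phase 1"
];

// No pattern uses the global flag, so RegExp.test keeps no state between calls.
function looksLikeSessionSnapshot(text: string): boolean {
  return SNAPSHOT_PATTERNS.some(pattern => pattern.test(text));
}

const snapshots = [
  "1237 tests pass, 226 suites",
  "USB 同步:37 個文件(bundles, server, frontend, tests, docs)",
  "pathology-playground 後端健康改進計劃已完成 Phase 1-4",
];
const durableFacts = [
  "Backend health improvements organized into phased milestones",
  "USB sync covers bundles, server, frontend, tests, and docs",
  "Test suite expected to pass before handoff",
];

console.log(snapshots.map(looksLikeSessionSnapshot));    // [ true, true, true ]
console.log(durableFacts.map(looksLikeSessionSnapshot)); // [ false, false, false ]
```

Requiring a digit is what lets "phased milestones" and "tests, and docs" through. The trade-off is that a durable fact that happens to mention "Phase 2" would also be rejected by this sketch.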
test("parseWorkspaceMemoryCandidates accepts short Admin PIN reference entry", () => {
  // Real Admin PIN is <20 chars — should pass via config value allowlist
  const summary = `
Memory candidates:
- reference Admin PIN 是 456123
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 1, "Short config reference should pass via allowlist");
  assert.equal(items[0].type, "reference");
});

test("parseWorkspaceMemoryCandidates accepts Scrypt config reference", () => {
  // Scrypt parameters with numbers should pass
  const summary = `
Memory candidates:
- reference Scrypt 參數必須是 N=16384, r=8, p=1
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 1, "Scrypt config values should pass");
  assert.equal(items[0].type, "reference");
});

test("parseWorkspaceMemoryCandidates rejects Chinese file count snapshot", () => {
  // Real Chinese file count with counter word 個
  const summary = `
Memory candidates:
- project USB 同步:37 個文件(bundles, server, frontend, tests, docs)
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 0, "Chinese file count with 個 should be rejected");
});

test("parseWorkspaceMemoryCandidates rejects real phase snapshot mid-description", () => {
  // Real phase snapshot where Phase appears deep in the string
  const summary = `
Memory candidates:
- project pathology-playground 後端健康改進計劃已完成 Phase 1-4
`;

  const items = parseWorkspaceMemoryCandidates(summary);
  assert.equal(items.length, 0, "Phase snapshot mid-description should still be rejected");
});
@@ -5,6 +5,7 @@ import { tmpdir } from "node:os";
|
||||
import { join } from "node:path";
|
||||
import { MemoryV2Plugin } from "../src/plugin.ts";
|
||||
import { loadSessionState, saveSessionState } from "../src/session-state.ts";
|
||||
import { parseWorkspaceMemoryCandidates } from "../src/extractors.ts";
|
||||
import type { OpenError } from "../src/types.ts";
|
||||
|
||||
// Mock client for root session (not a sub-agent)
|
||||
@@ -192,4 +193,156 @@ test("tool.execute.after: exitCode non-zero creates open error", async () => {
|
||||
} finally {
|
||||
await rm(tmpDir, { recursive: true, force: true });
|
||||
}
|
||||
});
|
||||
|
||||
test("compaction hook sets output.prompt with ---free template", async () => {
  const tmpDir = await mkdtemp(join(tmpdir(), "memory-plugin-test-"));

  try {
    const client = mockRootClient();
    const plugin = await MemoryV2Plugin({ directory: tmpDir, client });

    // Create a session state with some data
    await saveSessionState(tmpDir, {
      version: 1,
      sessionID: "test-session-compaction",
      turn: 1,
      updatedAt: new Date().toISOString(),
      activeFiles: [{ path: "/src/index.ts", action: "edit", count: 5, lastSeen: Date.now() }],
      openErrors: [],
      recentDecisions: [{ text: "Test decision", rationale: "Testing", source: "user", createdAt: Date.now() }],
    });

    // Call the compaction hook
    const output = { context: [] as string[] };
    await (plugin as Record<string, Function>)["experimental.session.compacting"](
      { sessionID: "test-session-compaction" },
      output
    );

    // Should set output.prompt and clear output.context
    const prompt = (output as Record<string, unknown>).prompt as string | undefined;
    assert.ok(prompt, "output.prompt should be set");
    assert.equal(typeof prompt, "string", "output.prompt should be a string");
    assert.equal(output.context.length, 0, "output.context should be cleared after setting prompt");

    // Should NOT contain YAML frontmatter separators (--- at start)
    assert.equal(prompt!.includes("\n---"), false,
      "Prompt should not contain --- separators on their own line");

    // Should NOT contain XML-like tags
    assert.equal(prompt!.includes("<workspace_memory>"), false);
    assert.equal(prompt!.includes("</workspace_memory>"), false);
    assert.equal(prompt!.includes("<hot_session_state>"), false);
    assert.equal(prompt!.includes("<pending_todos>"), false);

    // Should NOT contain HTML comments
    assert.equal(prompt!.includes("<!--"), false);

    // Should contain the ---free template heading
    assert.equal(prompt!.includes("## Goal"), true,
      "Prompt should use ## Goal heading, not --- separators");

    // Should contain formatting rules that explicitly forbid ---
    assert.equal(prompt!.includes("Do not output YAML frontmatter"), true,
      "Prompt should explicitly forbid YAML frontmatter");
    assert.equal(prompt!.includes("horizontal rules"), true,
      "Prompt should explicitly forbid horizontal rules");

    // Should contain Memory candidates format
    assert.equal(prompt!.includes("Memory candidates:"), true,
      "Prompt should include Memory candidates: label");

    // Should contain our context data (hot session state)
    assert.equal(prompt!.includes("Hot session state"), true,
      "Prompt should include hot session state context");

    // Verify: prompt starts with plain text, not a markup delimiter
    assert.equal(prompt!.startsWith("---"), false,
      "Prompt should not start with --- (YAML frontmatter)");
    assert.equal(prompt!.startsWith("##"), false,
      "Prompt should start with plain instructions, not a heading");

  } finally {
    await rm(tmpDir, { recursive: true, force: true });
  }
});

test("compaction hook merges existing output.context from other plugins", async () => {
  const tmpDir = await mkdtemp(join(tmpdir(), "memory-plugin-test-"));

  try {
    const client = mockRootClient();
    const plugin = await MemoryV2Plugin({ directory: tmpDir, client });

    // Simulate another plugin having pushed context first
    const output = { context: ["Other plugin context data"] };
    await (plugin as Record<string, Function>)["experimental.session.compacting"](
      { sessionID: "test-merge-context" },
      output
    );

    const prompt = (output as Record<string, unknown>).prompt as string | undefined;
    assert.ok(prompt, "output.prompt should be set");
    assert.equal(output.context.length, 0, "output.context should be cleared");

    // Should contain the other plugin's context
    assert.equal(prompt!.includes("Other plugin context data"), true,
      "Prompt should preserve context from other plugins");

  } finally {
    await rm(tmpDir, { recursive: true, force: true });
  }
});

test("parseWorkspaceMemoryCandidates accepts Markdown section format", async () => {
  const summary = `
## Summary
Progress made on testing.

## Memory Candidates
- [decision] Use Markdown sections for candidates
- [project] This repo uses Markdown for docs

Next steps: continue development.
`;

  const candidates = parseWorkspaceMemoryCandidates(summary);
  assert.equal(candidates.length, 2, "Should parse Markdown section format");
  assert.equal(candidates[0].type, "decision");
  assert.equal(candidates[1].type, "project");
});

test("parseWorkspaceMemoryCandidates accepts legacy Workspace Memory Candidates section", async () => {
  const summary = `
## Summary
Progress made on testing.

## Workspace Memory Candidates
- [reference] Check docs at README.md

## Next Steps
Continue development.
`;

  const candidates = parseWorkspaceMemoryCandidates(summary);
  assert.equal(candidates.length, 1, "Should parse legacy section format");
  assert.equal(candidates[0].type, "reference");
});

test("parseWorkspaceMemoryCandidates still accepts legacy XML format", async () => {
  const summary = `
## Summary
Progress made on testing.

<workspace_memory_candidates>
- [feedback] Users prefer darker themes
</workspace_memory_candidates>

Next steps: continue development.
`;

  const candidates = parseWorkspaceMemoryCandidates(summary);
  assert.equal(candidates.length, 1, "Should parse legacy XML format");
  assert.equal(candidates[0].type, "feedback");
});
@@ -17,11 +17,32 @@ function entry(id: string, text: string, type: LongTermMemoryEntry["type"] = "de
|
||||
};
|
||||
}
|
||||
|
||||
/** Create an entry with a createdAt offset from now (negative = in the past) */
function agedEntry(
  id: string,
  text: string,
  type: LongTermMemoryEntry["type"] = "decision",
  opts: { daysAgo: number; source?: "compaction" | "explicit" | "manual"; staleAfterDays?: number } = { daysAgo: 0, source: "compaction" },
): LongTermMemoryEntry {
  const createdAt = new Date(Date.now() - opts.daysAgo * 86400000).toISOString();
  return {
    id,
    type,
    text,
    source: opts.source ?? "compaction",
    confidence: 0.75,
    status: "active",
    createdAt,
    updatedAt: createdAt,
    staleAfterDays: opts.staleAfterDays,
  };
}

// ============================================
|
||||
// Task 2: renderWorkspaceMemory tests
|
||||
// ============================================
|
||||
|
||||
test("renderWorkspaceMemory never truncates closing XML tag", () => {
|
||||
test("renderWorkspaceMemory respects budget and fits entries", () => {
|
||||
const entries = Array.from({ length: 28 }, (_, i) =>
|
||||
entry(`mem_${i}`, `Long durable memory entry ${i} `.repeat(20))
|
||||
);
|
||||
@@ -36,8 +57,8 @@ test("renderWorkspaceMemory never truncates closing XML tag", () => {
|
||||
|
||||
const rendered = renderWorkspaceMemory(store);
|
||||
|
||||
assert.ok(rendered.endsWith("</workspace_memory>"),
|
||||
`Rendered memory must end with closing tag. Got: ...${rendered.slice(-50)}`);
|
||||
assert.ok(!rendered.includes("<workspace_memory>"),
|
||||
"Should not contain XML tags");
|
||||
assert.ok(rendered.length <= 700,
|
||||
`Rendered memory must not exceed maxChars. Got: ${rendered.length}`);
|
||||
});
|
||||
@@ -56,7 +77,7 @@ test("renderWorkspaceMemory returns empty string when maxChars too small", () =>
|
||||
"When maxChars too small for even minimal envelope, return empty string");
|
||||
});
|
||||
|
||||
test("renderWorkspaceMemory respects budget and fits entries", () => {
|
||||
test("renderWorkspaceMemory respects small budget", () => {
|
||||
// Create entries that would overflow a small budget
|
||||
const entries = [
|
||||
entry("a", "First memory entry that is reasonably long"),
|
||||
@@ -74,8 +95,8 @@ test("renderWorkspaceMemory respects budget and fits entries", () => {
|
||||
|
||||
const rendered = renderWorkspaceMemory(store);
|
||||
|
||||
assert.ok(rendered.endsWith("</workspace_memory>"),
|
||||
"Must end with closing tag even when truncating entries");
|
||||
assert.ok(!rendered.includes("<workspace_memory>"),
|
||||
"Should not contain XML tags");
|
||||
assert.ok(rendered.length <= 200,
|
||||
`Must respect maxChars limit. Got: ${rendered.length}`);
|
||||
});
|
||||
@@ -207,4 +228,189 @@ test("enforceLongTermLimits respects maxEntries limit", () => {
|
||||
|
||||
const kept = enforceLongTermLimits(entries);
|
||||
assert.ok(kept.length <= 28, `Should respect maxEntries. Got: ${kept.length}`);
|
||||
});
|
||||
|
||||
// ============================================
// P0d: identity-key dedup, supersession, staleness
// ============================================

test("enforceLongTermLimits project: bilingual variants collapse to one", () => {
  // All three mention opencode-agenthub plugin system - should merge
  const entries = [
    agedEntry("p1", "此 repo 在開發時使用 opencode-agenthub 插件系統,目錄位於 /Users/sd_wo/work/opencode-working-memory/.opencode-agenthub/", "project", { daysAgo: 2 }),
    agedEntry("p2", "此 repo 在開發時使用 opencode-agenthub 插件系統", "project", { daysAgo: 1 }),
    agedEntry("p3", "This repo uses opencode-agenthub plugin system at /Users/sd_wo/work/opencode-working-memory/", "project", { daysAgo: 0 }),
  ];

  const kept = enforceLongTermLimits(entries);
  const projectEntries = kept.filter(e => e.type === "project");
  assert.equal(projectEntries.length, 1, "All three project variants should merge to one");
});

test("enforceLongTermLimits reference: same config path variants collapse to one", () => {
  const entries = [
    agedEntry("r1", "OpenCode plugin config location: .opencode-agenthub/current/xdg/opencode/opencode.json in workspace", "reference", { daysAgo: 1 }),
    agedEntry("r2", "OpenCode plugin config: .opencode-agenthub/current/xdg/opencode/opencode.json in workspace", "reference", { daysAgo: 0 }),
  ];

  const kept = enforceLongTermLimits(entries);
  const refEntries = kept.filter(e => e.type === "reference");
  assert.equal(refEntries.length, 1, "Both reference variants should merge to one");
});

test("enforceLongTermLimits decision: newer supersedes older on same topic", () => {
  // "4 formats" supersedes "3 formats" on the same parser topic
  const entries = [
    agedEntry("d1", "Parser supports 3 formats: HTML comment, Markdown section, legacy XML", "decision", { daysAgo: 2 }),
    agedEntry("d2", "Parser supports 4 formats: plain text label, Markdown section, legacy section name, legacy XML", "decision", { daysAgo: 0 }),
  ];

  const kept = enforceLongTermLimits(entries);
  const decisionEntries = kept.filter(e => e.text.includes("formats"));
  assert.equal(decisionEntries.length, 1, "Newer 4-formats should supersede older 3-formats");
  assert.ok(decisionEntries[0].text.includes("4 formats"), "Kept entry should be the 4-formats one");
});

test("enforceLongTermLimits feedback: newer supersedes older on same issue", () => {
  const entries = [
    agedEntry("f1", "Purple/italic text issue resolved by using plain text labels instead of any special markup syntax", "feedback", { daysAgo: 2 }),
    agedEntry("f2", "Purple/italic text issue resolved by replacing default compaction template with ---free version using only Markdown headings", "feedback", { daysAgo: 0 }),
  ];

  const kept = enforceLongTermLimits(entries);
  const feedbackEntries = kept.filter(e => e.type === "feedback");
  assert.equal(feedbackEntries.length, 1, "Newer purple/italic fix should supersede older");
  assert.ok(feedbackEntries[0].text.includes("replacing default compaction template"), "Kept entry should be the newer fix");
});

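The supersession behaviour these tests check (entries on the same topic collapse to the newest one) can be sketched as a single pass over a map keyed by topic. This is a simplified illustration, not the repository's `enforceLongTermLimits`: `TopicEntry`, `topicKey`, and `supersede` are hypothetical names, the topic detector knows only one rule, and source priority and confidence are left out.

```typescript
// Hypothetical minimal entry shape for the illustration.
interface TopicEntry {
  id: string;
  text: string;
  createdAt: string; // ISO timestamp
}

// Assumed single-rule topic detector; the diff's decisionTopicKey has more rules.
function topicKey(text: string): string | null {
  return /parser.*formats?/i.test(text) ? "parser-supported-formats" : null;
}

// Keep one entry per topic; on conflict the newer createdAt wins.
// Entries without a detected topic fall back to their lowercased text as key.
function supersede(entries: TopicEntry[]): TopicEntry[] {
  const byKey = new Map<string, TopicEntry>();
  for (const entry of entries) {
    const key = topicKey(entry.text) ?? `text:${entry.text.toLowerCase()}`;
    const existing = byKey.get(key);
    if (!existing || Date.parse(entry.createdAt) > Date.parse(existing.createdAt)) {
      byKey.set(key, entry);
    }
  }
  return Array.from(byKey.values());
}

const supersededResult = supersede([
  { id: "d1", text: "Parser supports 3 formats", createdAt: "2026-04-24T00:00:00Z" },
  { id: "d2", text: "Parser supports 4 formats", createdAt: "2026-04-26T00:00:00Z" },
  { id: "d3", text: "Use pnpm instead of npm", createdAt: "2026-04-25T00:00:00Z" },
]);

console.log(supersededResult.map(e => e.id)); // [ "d2", "d3" ]
```

Because the map slot is created by `d1` and then overwritten by `d2`, the surviving entry keeps the position of the first entry seen for that topic.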
test("enforceLongTermLimits stale: compaction entry older than staleAfterDays+grace is pruned", () => {
|
||||
// decision with staleAfterDays=45, 76 days old (> 45+30 grace=75)
|
||||
const entries = [
|
||||
agedEntry("stale", "Compaction output contract changed from XML to HTML comments to avoid Markdown rendering issues", "decision", { daysAgo: 76, staleAfterDays: 45 }),
|
||||
];
|
||||
|
||||
const kept = enforceLongTermLimits(entries);
|
||||
assert.equal(kept.length, 0, "Stale compaction entry should be pruned");
|
||||
});
|
||||
|
||||
test("enforceLongTermLimits stale: explicit entry is retained even if old", () => {
|
||||
// explicit entry - never auto-pruned regardless of age
|
||||
const entries = [
|
||||
agedEntry("old_explicit", "User explicitly set Admin PIN 456123 for the system", "reference", { daysAgo: 500, source: "explicit", staleAfterDays: 90 }),
|
||||
];
|
||||
|
||||
const kept = enforceLongTermLimits(entries);
|
||||
assert.equal(kept.length, 1, "Explicit entry should never be age-pruned");
|
||||
});
|
||||
|
||||
test("enforceLongTermLimits stale: feedback entry is retained regardless of age", () => {
  // feedback - never age-pruned (only superseded)
  const entries = [
    agedEntry("old_feedback", "Users prefer darker themes over light themes", "feedback", { daysAgo: 300, staleAfterDays: 30 }),
  ];

  const kept = enforceLongTermLimits(entries);
  assert.equal(kept.length, 1, "Feedback entry should never be age-pruned");
});

test("enforceLongTermLimits stale: compaction entry within grace period is retained", () => {
  // decision staleAfterDays=45, 60 days old (< 45+30=75 grace) - should keep
  const entries = [
    agedEntry("within_grace", "Some compaction decision made two months ago", "decision", { daysAgo: 60, staleAfterDays: 45 }),
  ];

  const kept = enforceLongTermLimits(entries);
  assert.equal(kept.length, 1, "Entry within grace period should be retained");
});

test("enforceLongTermLimits dedup before trim: cleanup runs before maxEntries slice", () => {
  // 30 entries that should dedupe to < 28, confirming trim doesn't run before dedupe
  const entries = [
    ...Array.from({ length: 15 }, (_, i) =>
      agedEntry(`a${i}`, "opencode uses npm cache for plugin loading", "decision", { daysAgo: 0 })
    ),
    ...Array.from({ length: 15 }, (_, i) =>
      agedEntry(`b${i}`, "opencode uses npm cache for plugin loading", "decision", { daysAgo: 0 })
    ),
  ];

  const kept = enforceLongTermLimits(entries);
  assert.equal(kept.length, 1, "All duplicates should merge to 1 entry, far below maxEntries");
});

test("enforceLongTermLimits priority: freshness used as tie-breaker among same priority entries", () => {
  // Same type, same source, same confidence: newer should win
  const older = agedEntry("older", "Some durable configuration fact about the workspace", "reference", { daysAgo: 30, source: "compaction", staleAfterDays: 90 });
  const newer = agedEntry("newer", "Some durable configuration fact about the workspace", "reference", { daysAgo: 5, source: "compaction", staleAfterDays: 90 });

  const kept = enforceLongTermLimits([older, newer]);
  assert.equal(kept.length, 1);
  assert.equal(kept[0].id, "newer", "Newer entry should win as tie-breaker");
});

test("enforceLongTermLimits feedback: 500 error and port issue are NOT collapsed", () => {
  // Distinct feedback entries should remain separate.
  // The fixtures are intentionally Chinese (an English variant is tested separately):
  //   f1: "Browser login returns 500 internal_error; the code logic is correct but the cause is unknown"
  //   f2: "Port 9473 may be occupied by an old process; it needs to be killed and restarted"
  const entries = [
    agedEntry("f1", "瀏覽器登入出現 500 internal_error,代碼邏輯正確但原因不明", "feedback", { daysAgo: 0 }),
    agedEntry("f2", "端口 9473 可能被舊進程佔用,需殺掉後重啟", "feedback", { daysAgo: 0 }),
  ];

  const kept = enforceLongTermLimits(entries);
  const feedbackEntries = kept.filter(e => e.type === "feedback");
  assert.equal(feedbackEntries.length, 2, "Distinct feedback items should not collapse");
});

test("enforceLongTermLimits config: unrelated plugin configs are NOT collapsed", () => {
  const entries = [
    agedEntry("c1", "OpenCode plugin config: .opencode-agenthub/current/xdg/opencode/opencode.json in workspace", "reference", { daysAgo: 0 }),
    agedEntry("c2", "Vite plugin config location: vite.config.ts at project root", "reference", { daysAgo: 0 }),
  ];

  const kept = enforceLongTermLimits(entries);
  const refEntries = kept.filter(e => e.type === "reference");
  assert.equal(refEntries.length, 2, "Unrelated plugin configs should remain separate");
});

test("enforceLongTermLimits supersession: newer shorter decision beats older longer one", () => {
  // Same topic, same source, same confidence: newer wins even if shorter
  const older = agedEntry("d1", "Parser supports 3 formats: HTML comment, Markdown section, legacy XML with backward compatibility", "decision", { daysAgo: 5 });
  const newer = agedEntry("d2", "Parser supports 4 formats", "decision", { daysAgo: 0 });

  const kept = enforceLongTermLimits([older, newer]);
  const decisions = kept.filter(e => e.type === "decision" && /parser.*format/i.test(e.text));
  assert.equal(decisions.length, 1, "Newer shorter decision should supersede older longer one");
  assert.ok(decisions[0].text.includes("4 formats"), "Kept entry should be the newer 4-formats");
});

test("enforceLongTermLimits feedback: English port issue does NOT collapse with server error", () => {
  const entries = [
    agedEntry("e1", "Browser login 500 internal_error, code correct but cause unknown", "feedback", { daysAgo: 0 }),
    agedEntry("e2", "Port 9473 occupied by old process, may need to kill and restart", "feedback", { daysAgo: 0 }),
  ];

  const kept = enforceLongTermLimits(entries);
  const feedbackEntries = kept.filter(e => e.type === "feedback");
  assert.equal(feedbackEntries.length, 2, "English port issue and server error should remain separate");
});

test("enforceLongTermLimits config: unrelated generic plugin configs do NOT collapse", () => {
  const entries = [
    agedEntry("c1", "Vite plugin config location: vite.config.ts at project root", "reference", { daysAgo: 0 }),
    agedEntry("c2", "ESLint plugin config location: eslint.config.js at project root", "reference", { daysAgo: 0 }),
  ];

  const kept = enforceLongTermLimits(entries);
  const refEntries = kept.filter(e => e.type === "reference");
  assert.equal(refEntries.length, 2, "Unrelated plugin configs without entity key should remain separate");
});

test("enforceLongTermLimits feedback: supersession prefers newer shorter over older longer", () => {
  // Same purple/italic issue, newer shorter fix supersedes older verbose fix
  const older = agedEntry("f1", "Purple/italic text issue resolved by using plain text labels instead of any special markup syntax in the prompt", "feedback", { daysAgo: 5 });
  const newer = agedEntry("f2", "Purple/italic text fixed via template replacement", "feedback", { daysAgo: 0 });

  const kept = enforceLongTermLimits([older, newer]);
  const feedbackEntries = kept.filter(e => e.type === "feedback");
  assert.equal(feedbackEntries.length, 1, "Newer shorter feedback should supersede older longer");
  assert.ok(feedbackEntries[0].text.includes("template replacement"), "Kept entry should be the newer fix");
});