Files
opencode-kimi-full/AGENTS.md
T
lemon07r 6549be2f3d Release v1.1.0: auto-discover model id + context length from /coding/v1/models
Mirrors kimi-cli's refresh_managed_models behavior: at login (and on every
token refresh) the plugin now GETs /coding/v1/models with the user's JWT,
caches the first returned {id, context_length, display_name} in auth.json,
and rewrites the wire 'model' body field to the cached id inside
loader.fetch. K2.5 accounts (server returns e.g. 'k2p5') and K2.6 accounts
(server returns 'kimi-for-coding') now share identical opencode config.

After successful login, the authorize callback prints a ready-to-paste
provider config block with the discovered values filled in.

- src/oauth.ts: added listModels() (GET /coding/v1/models).
- src/index.ts: OAuthAuth extended with model_id/context_length/model_display;
  ensureFresh() refetches on refresh; loader.fetch rewrites JSON body
  'model' when cached id differs from MODEL_ID; authorize callback runs
  listModels() + console.logs a ready-to-paste config block.
- README.md: drop hardcoded context/output limits and K2.6-specific name;
  explain auto-discovery and the K2.5/K2.6 alias mechanism.
- AGENTS.md: rule 6 rewritten to describe the wire-rewrite contract.
- test/plugin.test.ts: +3 tests (discovery on refresh persists metadata,
  graceful /models failure, body.model rewrite K2.5 + K2.6 no-op);
  existing 401-retry and refresh tests updated for the extra /models call.

Tested live against a real K2.6 account — listModels() returns
[{id: 'kimi-for-coding', context_length: 262144, ...}].
2026-04-17 04:53:27 -04:00

12 KiB
Raw Blame History

AGENTS.md — working notes for coding agents (and humans)

This file is the single source of truth for any AI agent (or human) modifying this repo. Read it top-to-bottom before touching code. If something you learn here contradicts what you see in the code, the code wins — update this file in the same commit.

User-facing install / usage documentation lives in README.md. Do not duplicate it here.


Purpose

One plugin, one job: make opencode talk to Kimi's kimi-for-coding endpoint exactly the way the official kimi-cli does. Everything in this repo exists to minimize drift from upstream kimi-cli.

The one rule that matters

Moonshot's backend picks the model (K2.5 vs K2.6) from the auth token type, not the model-name string.

  • Static sk-kimi-... API key → K2.5.
  • OAuth JWT with scope: kimi-code → K2.6.

Every design decision here follows from that: we do device-flow OAuth, we do not accept API keys, we do not let the upstream SDK attach its own Authorization header.

Non-goals

  • No support for K2.5 or any non-kimi-for-coding model. opencode already handles those via Moonshot / Baseten / Alibaba-CN / etc.
  • No support for static API keys. Users who want that can use a different opencode provider entry.
  • No custom SSE parser, tool-call normalizer, or message rewriter. @ai-sdk/openai-compatible already does SSE/reasoning_content correctly.

Architecture

Three files, 1 job each. Do not add a fourth unless the existing three genuinely can't hold a new concern.

File Responsibility
src/constants.ts Pinned strings that must mirror upstream kimi-cli (version, endpoints, client id, scope).
src/headers.ts The seven X-Msh-* / UA headers + the persistent ~/.kimi/device_id file.
src/oauth.ts Device-code start, device-code poll, refresh-token exchange, and GET /coding/v1/models discovery.
src/index.ts Plugin entry. Wires auth hook (login + loader) and chat.params hook.

Data flow on a chat request:

  1. opencode asks the @ai-sdk/openai-compatible provider for a language model.
  2. Before instantiating it, opencode calls our auth.loader. We return { apiKey, fetch }.
  3. The SDK uses our fetch for every HTTP call (models, chat, whatever).
  4. Our fetch calls ensureFresh() → maybe refreshes → sets Authorization + the seven X-Msh-* headers → on 401 refreshes once and retries.
  5. Separately, opencode runs the chat.params hook and writes thinking, reasoning_effort, prompt_cache_key into output.options. opencode wraps those as { [providerID]: options } and the openai-compatible SDK forwards them as top-level body fields. That is why those keys must use exactly the wire names (prompt_cache_key, reasoning_effort, thinking).

Contracts to keep intact

These are the invariants that, if broken, silently degrade K2.6 → K2.5 or produce fingerprint-based throttling. Do not "clean them up" without reading the linked upstream.

  1. X-Msh-Version and User-Agent must track kimi-cli. Bumping involves exactly one line in src/constants.ts. See upstream research/kimi-cli/src/kimi_cli/constant.py. The UA prefix is KimiCLI/ (not KimiCodeCLI/) — Moonshot's kimi-for-coding backend 403s with access_terminated_error: only available for Coding Agents such as Kimi CLI, Claude Code, Roo Code… on any other prefix. Likewise, X-Msh-Device-Model is "{system} {release} {machine}" (e.g. Linux 7.0.0 x86_64) — NOT just {arch} — and X-Msh-Os-Version is the kernel build string from os.version(), NOT "{type} {release}". Tested live against api.kimi.com/coding/v1 on 2026-04-17 — any of those three fields off-spec → 403.

  2. X-Msh-Device-Id must be stable across runs. Never regenerate a fresh UUID at import time. getDeviceId() reads/writes ~/.kimi/device_id; that path is shared with kimi-cli on purpose.

  3. Authorization header is owned by loader.fetch. Anything else (opencode core, the SDK, future hooks) must be overridden. Our loader deletes both authorization and Authorization before setting its own.

  4. Effort ↔ fields mapping (kimi-cli llm.py / kosong/chat_provider/kimi.py):

    Effort reasoning_effort thinking
    auto (omitted) (omitted)
    off (omitted) {type:"disabled"}
    low "low" {type:"enabled"}
    medium "medium" {type:"enabled"}
    high "high" {type:"enabled"}

    auto is the "let the server decide dynamically" variant — neither field is sent, matching kimi-cli's "nothing passed" default. When no effort is set at all, the plugin still emits thinking: {type: "enabled"} because the model is a reasoner. Gate the hook on input.model.providerID — NOT input.provider.info.id. The @opencode-ai/plugin ProviderContext type claims .info.id exists, but the runtime shape opencode passes (see research/opencode/packages/opencode/src/session/llm.ts::stream, ~line 168, provider: item) is the flat ProviderConfig (.id). input.model.providerID is what every first-party plugin uses (cloudflare.ts, codex.ts, github-copilot/copilot.ts) and it avoids the runtime crash "undefined is not an object (evaluating 'input.provider.info.id')". Tested live 2026-04-17.

  5. prompt_cache_key only for kimi-for-coding. Never attach it to unrelated models. The check is input.model.id === MODEL_ID in chat.params.

  6. Wire model id comes from /coding/v1/models, not from user config. The opencode-side model id is a stable alias (MODEL_ID = "kimi-for-coding"); the plugin calls GET /coding/v1/models at login and on every token refresh (mirroring kimi-cli's refresh_managed_models in research/kimi-cli/src/kimi_cli/auth/platforms.py), caches the first returned {id, context_length, display_name} in auth.json as model_id/context_length/model_display, and rewrites the JSON body model field inside loader.fetch whenever the cached id differs from MODEL_ID (the K2.5 case — server may return k2p5 instead). K2.6 accounts see id: "kimi-for-coding" and the rewrite is a no-op. Do not strip the kimi- prefix; send whatever the server returned. Discovery failures are non-fatal (stale cached id still works; 401 retry flushes broken tokens).

  7. Auth store is opencode's, not kimi-cli's. We use client.auth.get/set against the kimi-for-coding-oauth provider id. Do not read/write ~/.kimi/credentials/kimi-code.json; that's kimi-cli's file and sharing it across independent apps causes token-race bugs.

  8. Provider id must not collide with any id in the models.dev catalog. models.dev publishes kimi-for-coding (static KIMI_API_KEY@ai-sdk/anthropic → K2.5). If we registered under that same id, opencode auth login kimi-for-coding would surface two methods under one entry and users picking the API-key one would silently land on K2.5. We deliberately use kimi-for-coding-oauth instead; MODEL_ID on the wire stays kimi-for-coding (rule 6).

  9. src/index.ts must have exactly one export — the default plugin function. opencode's plugin loader (research/opencode/packages/opencode/src/plugin/index.tsgetLegacyPlugins) iterates every export of the plugin module and throws Plugin export is not a function if any named export is not callable. The failure mode is silent in the CLI (the provider just doesn't appear in opencode auth login); the error only surfaces in ~/.local/share/opencode/log/*.log. Keep constants in src/constants.ts and import them in src/index.ts rather than re-exporting. test/exports.test.ts guards this.

Working on this repo

  • Code style: see tsconfig.json (strict, noUncheckedIndexedAccess, ES2022). Prefer small pure functions, avoid try/catch except where we genuinely convert one error shape to another.
  • Comments: match the existing density — only explain non-obvious upstream-parity reasoning. Do not narrate the obvious ("// refresh the token"); instead reference upstream files when the reasoning is "because kimi-cli does it that way".
  • Dependencies: runtime deps stay at zero. The only dev/peer dep is @opencode-ai/plugin for types.
  • Git commits: small, logical, imperative subject ("Add oauth device flow"). Do not add a Co-authored-by trailer.
  • Upstream research: the research/ directory is a read-only git-ignored pair of shallow clones (opencode + kimi-cli) for grep. Never edit files there; re-clone if you suspect drift. When citing upstream in a comment, use the research/… path so the reference is resolvable.
  • Version bumps: when kimi-cli bumps, (1) pull a fresh research/kimi-cli, (2) update KIMI_CLI_VERSION in src/constants.ts, (3) re-diff _kimi_default_headers() / oauth.py against src/headers.ts and src/oauth.ts, (4) smoke-test with opencode auth login kimi-for-coding-oauth and a one-turn chat, (5) tag release.
  • Tests: test/ holds one file per source file plus test/exports.test.ts (the rule-9 guard). Tests mock fetch via test/_util/fetchMock.ts; no real credentials or network. They use the real ~/.kimi/device_id on purpose — it is shared with kimi-cli by design and getDeviceId is idempotent, so tests don't clobber state. When adding a new contract to the list above, add the matching offline check to the corresponding test file rather than creating new ones.

What not to do

  • Don't add heuristics that look at the model id outside of chat.params. The auth.loader fetch is already scoped to this provider; the only place that needs to match on kimi-for-coding is the params hook.
  • Don't rename the provider id back to kimi-for-coding or to anything else listed in models.dev. See rule 8.
  • Don't add new header values that kimi-cli doesn't send. The fingerprint matters.
  • Don't call out to other files to "share" the kimi-cli credentials. Different OAuth consumers must have independent refresh-token chains or one will invalidate the other.
  • Don't introduce a build step. The plugin ships as .ts and opencode's bun-based loader handles it.
  • Don't add tests that require real Kimi credentials and check them in. If you add offline unit tests, put them under test/ and mock fetch.
  • Don't add named exports to src/index.ts. See rule 9.

How to verify a change

Offline:

bunx tsc --noEmit                                  # type-check
bun build --target=node --no-bundle src/index.ts   # syntax check
bun test                                           # offline unit tests

Online (requires a real Kimi-for-coding account):

  1. cd ~/.opencode && bun add /path/to/this/repo
  2. Paste the provider block from README.md into your opencode config.
  3. opencode auth login kimi-for-coding-oauth — confirm a token lands in opencode's auth.json with type: "oauth", a JWT access, and expires ~15 min in the future.
  4. Start opencode, select kimi-for-coding-oauth/kimi-for-coding, and ask the model to self-identify. It should claim to be K2.6 / kimi-for-coding.
  5. Confirm reasoning_content deltas render as thinking content (not assistant text).
  6. In a second turn of the same session, confirm the response comes back faster (cache hit via prompt_cache_key).

If any of 36 fails, diff research/kimi-cli against the contracts above.

House rules for AI agents

  • Read this file first. Every time.
  • Don't grow the dependency footprint to "simplify" something; this plugin's value is being small and audit-able.
  • When in doubt, mirror kimi-cli exactly, then comment the upstream reference. "We used to deviate, it broke" — document it here.
  • Keep README.md user-focused and this file contributor-focused. If you catch yourself duplicating, move content here and link from the README.
  • Any new rule you add here must have a real incident or a grep-verified upstream source behind it. No speculative "best practices".