AGENTS.md — working notes for coding agents (and humans)
This file is the single source of truth for any AI agent (or human) modifying this repo. Read it top-to-bottom before touching code. If something you learn here contradicts what you see in the code, the code wins — update this file in the same commit.
User-facing install / usage documentation lives in README.md. Do not duplicate it here.
Purpose
One plugin, one job: make opencode talk to Kimi's kimi-for-coding endpoint exactly the way the official kimi-cli does. Everything in this repo exists to minimize drift from upstream kimi-cli.
The one rule that matters
Moonshot's backend picks the model (K2.5 vs K2.6) from the auth token type, not the model-name string.
- Static `sk-kimi-...` API key → K2.5.
- OAuth JWT with `scope: kimi-code` → K2.6.
Every design decision here follows from that: we do device-flow OAuth, we do not accept API keys, we do not let the upstream SDK attach its own Authorization header.
Non-goals
- No support for K2.5 or any non-`kimi-for-coding` model. opencode already handles those via Moonshot / Baseten / Alibaba-CN / etc.
- No support for static API keys. Users who want that can use a different opencode provider entry.
- No custom SSE parser, tool-call normalizer, or message rewriter. `@ai-sdk/openai-compatible` already does SSE/`reasoning_content` correctly.
Architecture
Four files, one job each. Do not add a fifth unless the existing four genuinely can't hold a new concern.
| File | Responsibility |
|---|---|
| `src/constants.ts` | Pinned strings that must mirror upstream kimi-cli (version, endpoints, client id, scope). |
| `src/headers.ts` | The seven `X-Msh-*` / UA headers + the persistent `~/.kimi/device_id` file. |
| `src/oauth.ts` | Device-code start, device-code poll, refresh-token exchange, and `GET /coding/v1/models` discovery. |
| `src/index.ts` | Plugin entry. Wires auth (login + loader) plus the Kimi chat hooks/body rewrite. |
Data flow on a chat request:
1. opencode asks the `@ai-sdk/openai-compatible` provider for a language model.
2. Before instantiating it, opencode calls our `auth.loader`. We return `{ apiKey, fetch }`.
3. The SDK uses our `fetch` for every HTTP call (models, chat, whatever).
4. Our `fetch` calls `ensureFresh()` → maybe refreshes → lazily discovers `/coding/v1/models` when needed → sets `Authorization` + the seven `X-Msh-*` headers → on 401 refreshes once and retries.
5. Separately, opencode runs `chat.headers` and `chat.params`. `chat.headers` computes `thinking`, `reasoning_effort`, and `prompt_cache_key` from `input.model.options` plus the selected `input.message.model.variant`, then passes them to `loader.fetch` via private `x-opencode-kimi-*` headers. `loader.fetch` strips those headers and injects the wire fields into the JSON body. `chat.params` mirrors the same keys into `output.options` only as a forward-compat fallback if opencode later fixes its openai-compatible `providerOptions` namespace mismatch.
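The strip-and-inject step in the data flow can be sketched as a pure function. This is a minimal illustration, not the code in `src/index.ts`: the function name `prepareRequest`, the concrete header name `x-opencode-kimi-reasoning-effort`, the JSON encoding of header values, and the hyphen-to-underscore field naming are all assumptions made for the example.

```typescript
// Hypothetical prefix handling; the notes above only say the private headers
// match x-opencode-kimi-*.
const PRIVATE_PREFIX = "x-opencode-kimi-";

interface Outgoing {
  headers: Record<string, string>;
  body: string;
}

function prepareRequest(
  headers: Record<string, string>,
  body: string,
  accessToken: string,
): Outgoing {
  const wireFields: Record<string, unknown> = {};
  const cleaned: Record<string, string> = {};

  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    if (lower.startsWith(PRIVATE_PREFIX)) {
      // Assumption: each private header carries one JSON-encoded wire field.
      const field = lower.slice(PRIVATE_PREFIX.length).replace(/-/g, "_");
      wireFields[field] = JSON.parse(value);
      continue; // never forward private transport headers upstream
    }
    if (lower === "authorization") continue; // drop any casing set by others
    cleaned[name] = value;
  }

  cleaned["Authorization"] = `Bearer ${accessToken}`;
  const payload = JSON.parse(body) as Record<string, unknown>;
  return { headers: cleaned, body: JSON.stringify({ ...payload, ...wireFields }) };
}

const out = prepareRequest(
  {
    authorization: "Bearer set-by-sdk",
    "content-type": "application/json",
    "x-opencode-kimi-reasoning-effort": '"high"',
  },
  '{"model":"kimi-for-coding"}',
  "example-jwt",
);
console.log(out.body); // prints {"model":"kimi-for-coding","reasoning_effort":"high"}
```

Keeping this step free of network calls makes it easy to cover with an offline test that asserts no `x-opencode-kimi-*` header survives.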
Contracts to keep intact
These are the invariants that, if broken, silently degrade K2.6 → K2.5 or produce fingerprint-based throttling. Do not "clean them up" without reading the linked upstream.
1. `X-Msh-Version` and `User-Agent` must track `kimi-cli`. Bumping involves exactly one line in `src/constants.ts`. See upstream `research/kimi-cli/src/kimi_cli/constant.py`. The UA prefix is `KimiCLI/` (not `KimiCodeCLI/`): Moonshot's `kimi-for-coding` backend 403s with `access_terminated_error: only available for Coding Agents such as Kimi CLI, Claude Code, Roo Code…` on any other prefix. Likewise, `X-Msh-Device-Model` must mirror kimi-cli's `_device_model()` shape, including the Darwin/Windows special cases (`macOS <version> <arch>`, `Windows 10/11 <arch>`, Linux `"{system} {release} {machine}"`), NOT just `{arch}`, and `X-Msh-Os-Version` is the kernel build string from `os.version()`, NOT `"{type} {release}"`. Tested live against `api.kimi.com/coding/v1` on 2026-04-17: any of those three fields off-spec → 403.
2. `X-Msh-Device-Id` must be stable across runs. Never regenerate a fresh UUID at import time. `getDeviceId()` reads/writes `~/.kimi/device_id`; that path is shared with `kimi-cli` on purpose.
3. `Authorization` header is owned by `loader.fetch`. Anything else (opencode core, the SDK, future hooks) must be overridden. Our `loader` deletes both `authorization` and `Authorization` before setting its own. The private `x-opencode-kimi-*` transport headers are also consumed and stripped there; they must never leak upstream.
4. Effort ↔ fields mapping (kimi-cli `llm.py` / `kosong/chat_provider/kimi.py`):

   | Effort | `reasoning_effort` | `thinking` |
   |---|---|---|
   | `auto` | (omitted) | (omitted) |
   | `off` | (omitted) | `{type:"disabled"}` |
   | `low` | `"low"` | `{type:"enabled"}` |
   | `medium` | `"medium"` | `{type:"enabled"}` |
   | `high` | `"high"` | `{type:"enabled"}` |

   `auto` is the "let the server decide dynamically" variant: neither field is sent, matching kimi-cli's "nothing passed" default. When no effort is set at all, the plugin still emits `thinking: {type: "enabled"}` because the model is a reasoner. Compute this from `input.model.options` plus `input.model.variants[input.message.model.variant]`, not from `input.provider.info.id`. The `@opencode-ai/plugin` `ProviderContext` type claims `.info.id` exists, but the runtime shape opencode passes (see `research/opencode/packages/opencode/src/session/llm.ts::stream`, ~line 168, `provider: item`) is the flat `ProviderConfig` (`.id`). `input.model.providerID` is what every first-party plugin uses (`cloudflare.ts`, `codex.ts`, `github-copilot/copilot.ts`) and it avoids the runtime crash "undefined is not an object (evaluating 'input.provider.info.id')". Tested live 2026-04-17.
5. `prompt_cache_key` only for `kimi-for-coding`. Never attach it to unrelated models. The check is `input.model.id === MODEL_ID` in the Kimi chat hooks, and the actual wire injection happens in `loader.fetch`.
6. Wire model id comes from `/coding/v1/models`, not from user config. The opencode-side model id is a stable alias (`MODEL_ID = "kimi-for-coding"`); the plugin calls `GET /coding/v1/models` at login and on every token refresh (mirroring kimi-cli's `refresh_managed_models` in `research/kimi-cli/src/kimi_cli/auth/platforms.py`), caches the first returned `{id, context_length, display_name}` in loader memory, and rewrites the JSON body `model` field inside `loader.fetch` whenever the discovered id differs from `MODEL_ID` (the K2.5 case, where the server may return `k2p5` instead). A new loader instance re-discovers on first use if needed. K2.6 accounts see `id: "kimi-for-coding"` and the rewrite is a no-op. Do not strip the `kimi-` prefix; send whatever the server returned. Discovery failures are non-fatal (warm cached id still works; 401 retry flushes broken tokens).
7. Auth store is opencode's, not kimi-cli's. We use opencode's auth store for tokens under the `kimi-for-coding-oauth` provider id. Do not read/write `~/.kimi/credentials/kimi-code.json`; that's kimi-cli's file and sharing it across independent apps causes token-race bugs. Also note that opencode's SDK auth schema only persists the standard oauth fields, so model discovery metadata cannot be stored there durably.
8. Provider id must not collide with any id in the models.dev catalog. models.dev publishes `kimi-for-coding` (static `KIMI_API_KEY` → `@ai-sdk/anthropic` → K2.5). If we registered under that same id, `opencode auth login kimi-for-coding` would surface two methods under one entry and users picking the API-key one would silently land on K2.5. We deliberately use `kimi-for-coding-oauth` instead; `MODEL_ID` on the wire stays `kimi-for-coding` (rule 6).
9. `src/index.ts` must have exactly one export: the default plugin function. opencode's plugin loader (`research/opencode/packages/opencode/src/plugin/index.ts` → `getLegacyPlugins`) iterates every export of the plugin module and throws `Plugin export is not a function` if any named export is not callable. The failure mode is silent in the CLI (the provider just doesn't appear in `opencode auth login`); the error only surfaces in `~/.local/share/opencode/log/*.log`. Keep constants in `src/constants.ts` and import them in `src/index.ts` rather than re-exporting. `test/exports.test.ts` guards this.
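The effort mapping among the contracts above can be written as a small pure function. This is a sketch only: the names `Effort`, `WireFields`, and `effortToWireFields` are hypothetical and may not match what `src/index.ts` uses; the returned values follow the mapping as stated in these notes.

```typescript
type Effort = "auto" | "off" | "low" | "medium" | "high";

interface WireFields {
  reasoning_effort?: "low" | "medium" | "high";
  thinking?: { type: "enabled" | "disabled" };
}

function effortToWireFields(effort: Effort | undefined): WireFields {
  // No effort configured at all: still enable thinking, since the model is a reasoner.
  if (effort === undefined) return { thinking: { type: "enabled" } };
  // "auto": send neither field and let the server decide.
  if (effort === "auto") return {};
  // "off": disable thinking, no reasoning_effort.
  if (effort === "off") return { thinking: { type: "disabled" } };
  // low / medium / high: pass the level through and enable thinking.
  return { reasoning_effort: effort, thinking: { type: "enabled" } };
}

console.log(JSON.stringify(effortToWireFields("auto"))); // prints {}
console.log(JSON.stringify(effortToWireFields("off")));  // prints {"thinking":{"type":"disabled"}}
```

The distinction between `undefined` and `"auto"` is the part most easily lost in a refactor, so it is worth pinning in an offline test.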
Working on this repo
- Code style: see `tsconfig.json` (strict, `noUncheckedIndexedAccess`, ES2022). Prefer small pure functions, avoid `try/catch` except where we genuinely convert one error shape to another.
- Comments: match the existing density and only explain non-obvious upstream-parity reasoning. Do not narrate the obvious ("// refresh the token"); instead reference upstream files when the reasoning is "because kimi-cli does it that way".
- Dependencies: runtime deps stay at zero. The only dev/peer dep is `@opencode-ai/plugin` for types.
- Git commits: small, logical, imperative subject ("Add oauth device flow"). Do not add a `Co-authored-by` trailer.
- Upstream research: the `research/` directory is a read-only git-ignored pair of shallow clones (opencode + kimi-cli) for grep. Never edit files there; re-clone if you suspect drift. When citing upstream in a comment, use the `research/…` path so the reference is resolvable.
- Version bumps: when kimi-cli bumps, (1) pull a fresh `research/kimi-cli`, (2) update `KIMI_CLI_VERSION` in `src/constants.ts`, (3) re-diff `_kimi_default_headers()` / `oauth.py` against `src/headers.ts` and `src/oauth.ts`, (4) smoke-test with `opencode auth login kimi-for-coding-oauth` and a one-turn chat, (5) tag release.
- Tests: `test/` holds one file per source file plus `test/exports.test.ts` (the rule-9 guard). Tests mock `fetch` via `test/_util/fetchMock.ts`; no real credentials or network. They use the real `~/.kimi/device_id` on purpose: it is shared with kimi-cli by design and `getDeviceId` is idempotent, so tests don't clobber state. When adding a new contract to the list above, add the matching offline check to the corresponding test file rather than creating new ones.
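The idempotence the tests rely on can be illustrated with a read-or-create helper. This is a sketch under stated assumptions, not the code in `src/headers.ts`: `getDeviceIdAt` is a hypothetical name, it takes the path as a parameter only so the example can run against a scratch directory instead of `~/.kimi/device_id`, and the id format (a random UUID) is assumed rather than checked against kimi-cli.

```typescript
import { existsSync, mkdirSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { randomUUID } from "node:crypto";
import { tmpdir } from "node:os";
import { dirname, join } from "node:path";

// Return the persisted id if one exists; otherwise create it once and persist it.
// Nothing is generated at import time, only when the function is called.
function getDeviceIdAt(path: string): string {
  if (existsSync(path)) {
    const existing = readFileSync(path, "utf8").trim();
    if (existing.length > 0) return existing;
  }
  const id = randomUUID(); // assumed format, for illustration only
  mkdirSync(dirname(path), { recursive: true });
  writeFileSync(path, id, "utf8");
  return id;
}

// Usage against a throwaway directory: the second call reads what the first wrote.
const scratchDir = mkdtempSync(join(tmpdir(), "device-id-sketch-"));
const idPath = join(scratchDir, "nested", "device_id");
const firstId = getDeviceIdAt(idPath);
const secondId = getDeviceIdAt(idPath);
console.log(firstId === secondId); // prints true
```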
What not to do
- ❌ Don't add heuristics that look at the model id outside of the Kimi chat hooks / `loader.fetch`. The auth loader is already scoped to this provider; only the chat hooks and the body rewrite need to match on `kimi-for-coding`.
- ❌ Don't rename the provider id back to `kimi-for-coding` or to anything else listed in models.dev. See rule 8.
- ❌ Don't add new header values that kimi-cli doesn't send. The fingerprint matters.
- ❌ Don't call out to other files to "share" the kimi-cli credentials. Different OAuth consumers must have independent refresh-token chains or one will invalidate the other.
- ❌ Don't introduce a build step. The plugin ships as `.ts` and opencode's bun-based loader handles it.
- ❌ Don't add tests that require real Kimi credentials and check them in. If you add offline unit tests, put them under `test/` and mock `fetch`.
- ❌ Don't add named exports to `src/index.ts`. See rule 9.
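For reference, the body rewrite that is allowed to match on `kimi-for-coding` can be sketched as follows. The function name `rewriteModelField` is hypothetical and the real logic in `loader.fetch` may differ; the sketch only shows the intended scope: touch the `model` field when it is the stable alias and the server reported a different id, and leave everything else alone.

```typescript
const MODEL_ID = "kimi-for-coding";

// discoveredId is whatever GET /coding/v1/models returned, or undefined if
// discovery has not succeeded yet (discovery failures are non-fatal).
function rewriteModelField(body: string, discoveredId: string | undefined): string {
  if (discoveredId === undefined || discoveredId === MODEL_ID) return body; // no-op
  const payload = JSON.parse(body) as Record<string, unknown>;
  if (payload["model"] !== MODEL_ID) return body; // not our alias: do not touch
  // Send the server's id verbatim; no prefix stripping or other normalization.
  return JSON.stringify({ ...payload, model: discoveredId });
}

console.log(rewriteModelField('{"model":"kimi-for-coding","stream":true}', "k2p5"));
// prints {"model":"k2p5","stream":true}
```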
How to verify a change
Offline:
bunx tsc --noEmit # type-check
bun build --target=node --no-bundle src/index.ts # syntax check
bun test # offline unit tests
Online (requires a real Kimi-for-coding account):
1. Install the local checkout via opencode's plugin flow (`opencode plugin /path/to/this/repo --global`) or point the `plugin` array in your opencode config at the repo root, as shown in `README.md`.
2. Paste the provider block from `README.md` into your opencode config.
3. Run `opencode auth login kimi-for-coding-oauth` and confirm a token lands in opencode's `auth.json` with `type: "oauth"`, a JWT `access`, and `expires` ~15 min in the future.
4. Start opencode, select `kimi-for-coding-oauth/kimi-for-coding`, and ask the model to self-identify. It should claim to be K2.6 / `kimi-for-coding`.
5. Confirm `reasoning_content` deltas render as thinking content (not assistant text).
6. In a second turn of the same session, confirm the response comes back faster (cache hit via `prompt_cache_key`).
If any of 3–6 fails, diff research/kimi-cli against the contracts above.
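The token check after login can be done by eye, but a small helper makes the criteria explicit. This is a sketch with assumptions: `checkAuthEntry` is a hypothetical name, `expires` is assumed to be epoch milliseconds, and the 20-minute upper bound is an arbitrary tolerance around the "~15 min" figure above. It only inspects the three fields named in these notes and does not verify the JWT signature or scope.

```typescript
interface OAuthEntry {
  type: string;
  access: string;
  expires: number; // assumption: epoch milliseconds
}

// Returns a list of human-readable problems; an empty list means the entry looks right.
function checkAuthEntry(entry: OAuthEntry, nowMs: number): string[] {
  const problems: string[] = [];
  if (entry.type !== "oauth") {
    problems.push(`type is "${entry.type}", expected "oauth"`);
  }
  // A JWT has three dot-separated segments; a static sk-kimi-... key does not.
  if (entry.access.split(".").length !== 3) {
    problems.push("access does not look like a JWT");
  }
  const minutesLeft = (entry.expires - nowMs) / 60_000;
  if (minutesLeft <= 0 || minutesLeft > 20) {
    problems.push(`expires is ${minutesLeft.toFixed(1)} min away, expected roughly 15`);
  }
  return problems;
}

const nowMs = Date.now();
const healthy = checkAuthEntry(
  { type: "oauth", access: "header.payload.signature", expires: nowMs + 15 * 60_000 },
  nowMs,
);
console.log(healthy.length); // prints 0
```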
House rules for AI agents
- Read this file first. Every time.
- Don't grow the dependency footprint to "simplify" something; this plugin's value is being small and audit-able.
- When in doubt, mirror kimi-cli exactly, then comment the upstream reference. "We used to deviate, it broke" — document it here.
- Keep `README.md` user-focused and this file contributor-focused. If you catch yourself duplicating, move content here and link from the README.
- Any new rule you add here must have a real incident or a grep-verified upstream source behind it. No speculative "best practices".