8.9 KiB
AGENTS.md — working notes for coding agents (and humans)
This file is the single source of truth for any AI agent (or human) modifying this repo. Read it top-to-bottom before touching code. If something you learn here contradicts what you see in the code, the code wins — update this file in the same commit.
User-facing install / usage documentation lives in README.md. Do not duplicate it here.
Purpose
One plugin, one job: make opencode talk to Kimi's kimi-for-coding endpoint exactly the way the official kimi-cli does. Everything in this repo exists to minimize drift from upstream kimi-cli.
The one rule that matters
Moonshot's backend picks the model (K2.5 vs K2.6) from the auth token type, not the model-name string.
- Static
sk-kimi-...API key → K2.5. - OAuth JWT with
scope: kimi-code→ K2.6.
Every design decision here follows from that: we do device-flow OAuth, we do not accept API keys, we do not let the upstream SDK attach its own Authorization header.
Non-goals
- No support for K2.5 or any non-
kimi-for-codingmodel. opencode already handles those via Moonshot / Baseten / Alibaba-CN / etc. - No support for static API keys. Users who want that can use a different opencode provider entry.
- No custom SSE parser, tool-call normalizer, or message rewriter.
@ai-sdk/openai-compatiblealready does SSE/reasoning_contentcorrectly.
Architecture
Three files, 1 job each. Do not add a fourth unless the existing three genuinely can't hold a new concern.
| File | Responsibility |
|---|---|
src/constants.ts |
Pinned strings that must mirror upstream kimi-cli (version, endpoints, client id, scope). |
src/headers.ts |
The seven X-Msh-* / UA headers + the persistent ~/.kimi/device_id file. |
src/oauth.ts |
Device-code start, device-code poll, refresh-token exchange. Nothing else. |
src/index.ts |
Plugin entry. Wires auth hook (login + loader) and chat.params hook. |
Data flow on a chat request:
- opencode asks the
@ai-sdk/openai-compatibleprovider for a language model. - Before instantiating it, opencode calls our
auth.loader. We return{ apiKey, fetch }. - The SDK uses our
fetchfor every HTTP call (models, chat, whatever). - Our
fetchcallsensureFresh()→ maybe refreshes → sets Authorization + the sevenX-Msh-*headers → on 401 refreshes once and retries. - Separately, opencode runs the
chat.paramshook and writesthinking,reasoning_effort,prompt_cache_keyintooutput.options. opencode wraps those as{ [providerID]: options }and the openai-compatible SDK forwards them as top-level body fields. That is why those keys must use exactly the wire names (prompt_cache_key,reasoning_effort,thinking).
Contracts to keep intact
These are the invariants that, if broken, silently degrade K2.6 → K2.5 or produce fingerprint-based throttling. Do not "clean them up" without reading the linked upstream.
-
X-Msh-VersionandUser-Agentmust trackkimi-cli. Bumping involves exactly one line insrc/constants.ts. See upstreamresearch/kimi-cli/src/kimi_cli/constant.py. -
X-Msh-Device-Idmust be stable across runs. Never regenerate a fresh UUID at import time.getDeviceId()reads/writes~/.kimi/device_id; that path is shared withkimi-clion purpose. -
Authorizationheader is owned byloader.fetch. Anything else (opencode core, the SDK, future hooks) must be overridden. Ourloaderdeletes bothauthorizationandAuthorizationbefore setting its own. -
Effort ↔ fields mapping (kimi-cli
llm.py/kosong/chat_provider/kimi.py):Effort reasoning_effortthinking.typeoff(omitted) "disabled"low"low""enabled"medium"medium""enabled"high"high""enabled"Do not send
thinking.type="enabled"with noreasoning_effortunless the request never had one to begin with (the default "server picks" case). -
prompt_cache_keyonly forkimi-for-coding. Never attach it to unrelated models. The check isinput.model.id === MODEL_IDinchat.params. -
Model id goes over the wire verbatim. Don't strip the
kimi-prefix — the backend expects exactlykimi-for-coding. -
Auth store is opencode's, not kimi-cli's. We use
client.auth.get/setagainst thekimi-for-coding-oauthprovider id. Do not read/write~/.kimi/credentials/kimi-code.json; that's kimi-cli's file and sharing it across independent apps causes token-race bugs. -
Provider id must not collide with any id in the models.dev catalog. models.dev publishes
kimi-for-coding(staticKIMI_API_KEY→@ai-sdk/anthropic→ K2.5). If we registered under that same id,opencode auth login kimi-for-codingwould surface two methods under one entry and users picking the API-key one would silently land on K2.5. We deliberately usekimi-for-coding-oauthinstead;MODEL_IDon the wire stayskimi-for-coding(rule 6).
Working on this repo
- Code style: see
tsconfig.json(strict,noUncheckedIndexedAccess, ES2022). Prefer small pure functions, avoidtry/catchexcept where we genuinely convert one error shape to another. - Comments: match the existing density — only explain non-obvious upstream-parity reasoning. Do not narrate the obvious ("// refresh the token"); instead reference upstream files when the reasoning is "because kimi-cli does it that way".
- Dependencies: runtime deps stay at zero. The only dev/peer dep is
@opencode-ai/pluginfor types. - Git commits: small, logical, imperative subject ("Add oauth device flow"). Do not add a
Co-authored-bytrailer. - Upstream research: the
research/directory is a read-only git-ignored pair of shallow clones (opencode + kimi-cli) for grep. Never edit files there; re-clone if you suspect drift. When citing upstream in a comment, use theresearch/…path so the reference is resolvable. - Version bumps: when kimi-cli bumps, (1) pull a fresh
research/kimi-cli, (2) updateKIMI_CLI_VERSIONinsrc/constants.ts, (3) re-diff_kimi_default_headers()/oauth.pyagainstsrc/headers.tsandsrc/oauth.ts, (4) smoke-test withopencode auth login kimi-for-coding-oauthand a one-turn chat, (5) tag release.
What not to do
- ❌ Don't add heuristics that look at the model id outside of
chat.params. Theauth.loaderfetch is already scoped to this provider; the only place that needs to match onkimi-for-codingis the params hook. - ❌ Don't rename the provider id back to
kimi-for-codingor to anything else listed in models.dev. See rule 8. - ❌ Don't add new header values that kimi-cli doesn't send. The fingerprint matters.
- ❌ Don't call out to other files to "share" the kimi-cli credentials. Different OAuth consumers must have independent refresh-token chains or one will invalidate the other.
- ❌ Don't introduce a build step. The plugin ships as
.tsand opencode's bun-based loader handles it. - ❌ Don't add tests that require real Kimi credentials and check them in. If you add offline unit tests, put them under
test/and mockfetch.
How to verify a change
Offline:
bun build --target=node --no-bundle src/index.ts # syntax/type-ish check
Online (requires a real Kimi-for-coding account):
cd ~/.opencode && bun add /path/to/this/repo- Paste the provider block from
README.mdinto your opencode config. opencode auth login kimi-for-coding-oauth— confirm a token lands in opencode'sauth.jsonwithtype: "oauth", a JWTaccess, andexpires~15 min in the future.- Start opencode, select
kimi-for-coding-oauth/kimi-for-coding, and ask the model to self-identify. It should claim to be K2.6 /kimi-for-coding. - Confirm
reasoning_contentdeltas render as thinking content (not assistant text). - In a second turn of the same session, confirm the response comes back faster (cache hit via
prompt_cache_key).
If any of 3–6 fails, diff research/kimi-cli against the contracts above.
House rules for AI agents
- Read this file first. Every time.
- Don't grow the dependency footprint to "simplify" something; this plugin's value is being small and audit-able.
- When in doubt, mirror kimi-cli exactly, then comment the upstream reference. "We used to deviate, it broke" — document it here.
- Keep
README.mduser-focused and this file contributor-focused. If you catch yourself duplicating, move content here and link from the README. - Any new rule you add here must have a real incident or a grep-verified upstream source behind it. No speculative "best practices".