Backfill Kimi image input capability

lemon07r
2026-04-23 01:28:44 -04:00
parent a40a340dd2
commit 5efc6c39a3
5 changed files with 195 additions and 22 deletions
+1
@@ -69,6 +69,7 @@ These are the invariants that, if broken, silently route requests onto the wrong
9. **`src/index.ts` must have exactly one export — the default `PluginModule` object `{ id, server }`.** opencode's plugin loader (`research/opencode/packages/opencode/src/plugin/index.ts`) first tries `readV1Plugin` (detect mode) on the default export. If it finds an object with `server` (and optional `id`), it uses the v1 path directly. The older legacy path (`getLegacyPlugins`) iterates every export and throws `Plugin export is not a function` on any non-callable value — a problem that surfaced on Windows where Bun's standalone-binary dynamic imports can produce module namespace objects with unexpected non-function metadata. The v1 format bypasses `getLegacyPlugins` entirely. Keep constants in `src/constants.ts` and import them in `src/index.ts` rather than re-exporting. `test/exports.test.ts` guards this. The failure mode of a broken export is silent in the CLI (the provider just doesn't appear in `opencode auth login`); the error only surfaces in `~/.local/share/opencode/log/*.log`.
10. **The post-login config hint must not emit a partial `limit` object.** opencode's live config schema at `https://opencode.ai/config.json` requires both `limit.context` and `limit.output` whenever `limit` is present, while Kimi's `GET /coding/v1/models` only gives us `context_length`. Therefore `buildConfigBlock()` omits `limit` entirely and leaves `provider.models` to backfill `limit.context` at runtime. Do not invent `output` or set `input` heuristically; opencode's overflow logic treats `limit.input` as authoritative (`research/opencode/packages/opencode/src/session/overflow.ts`).
11. **Concurrent refreshes must collapse to one in-flight OAuth exchange, even across plugin instances.** `provider.models` and `auth.loader` can both notice an expiring token at about the same time, and separate opencode workspace/plugin instances can inherit stale auth snapshots. `refreshAuth()` in `src/index.ts` therefore shares one promise across overlapping callers, takes a provider-scoped auth-store lock before refreshing, re-reads opencode's live auth-store entry under that lock, and treats a changed on-disk token chain as authoritative. `test/plugin.test.ts` covers loader-vs-loader, provider.models-vs-loader, cross-instance lock reuse, and the `invalid_grant` self-heal path where another process already rotated the refresh token.
12. **Image-input capability must be backfilled from `/coding/v1/models`.** `supports_image_in` from Kimi discovery is not cosmetic metadata: opencode's provider transform (`research/opencode/packages/opencode/src/provider/transform.ts::unsupportedParts`) rewrites every image part into local `ERROR: Cannot read ... (this model does not support image input)` text before the request reaches our loader when `capabilities.input.image` is false. Therefore `provider.models` must patch runtime model metadata for `kimi-for-coding`, and `buildConfigBlock()` must include `attachment: true` plus `modalities.input = ["text","image"]` / `modalities.output = ["text"]` when discovery says images are supported. `test/plugin.test.ts` covers both paths.
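To make invariant 12 concrete, here is a minimal sketch of why the backfill matters. It is not opencode's or this plugin's actual code: the `Part` and `Model` types, `gateImageParts`, `backfillImageInput`, and the placeholder text are all hypothetical stand-ins, assuming only that the host replaces image parts with local text when `capabilities.input.image` is false.

```typescript
// Hypothetical, simplified shapes. These are NOT opencode's real types.
type Part = { type: "text"; text: string } | { type: "image"; filename: string }
type Model = { capabilities: { attachment: boolean; input: { image: boolean } } }

// Stand-in for the host-side gate: image parts become local placeholder text
// when the model metadata says image input is unsupported.
// The placeholder wording is invented for this sketch.
function gateImageParts(parts: Part[], model: Model): Part[] {
  if (model.capabilities.input.image) return parts
  return parts.map((part): Part =>
    part.type === "image"
      ? { type: "text", text: "[image dropped: model metadata says no image input]" }
      : part,
  )
}

// Stand-in for the backfill: copy the discovered flag into a new model object
// without mutating the original metadata.
function backfillImageInput(model: Model, supportsImageIn: boolean | undefined): Model {
  if (supportsImageIn === undefined) return model
  if (model.capabilities.input.image === supportsImageIn) return model
  return {
    ...model,
    capabilities: {
      ...model.capabilities,
      attachment: supportsImageIn ? true : model.capabilities.attachment,
      input: { ...model.capabilities.input, image: supportsImageIn },
    },
  }
}

const stock: Model = { capabilities: { attachment: false, input: { image: false } } }
const parts: Part[] = [
  { type: "text", text: "what is in this screenshot?" },
  { type: "image", filename: "shot.png" },
]

// Without the backfill the image is downgraded to text.
const before = gateImageParts(parts, stock)
// With the backfill the image part survives the gate unchanged.
const after = gateImageParts(parts, backfillImageInput(stock, true))
```

In this sketch `before[1]` ends up as a text part while `after[1]` is still the image part, which mirrors the failure mode the invariant guards against.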
### Working on this repo
+3 -2
@@ -9,7 +9,7 @@ Compared with stock opencode Kimi setups, this plugin:
- sends the same `User-Agent` / `X-Msh-*` fingerprint headers as `kimi-cli`
- reuses `~/.kimi/device_id` for `X-Msh-Device-Id`
- adds `prompt_cache_key`, `thinking`, and `reasoning_effort` for `kimi-for-coding` requests
- discovers the authoritative wire model slug, API display name, and context length from `/coding/v1/models`
- discovers the authoritative wire model slug, API display name, context length, and image-input capability from `/coding/v1/models`
- keeps tokens in opencode's auth store while mirroring `kimi-cli`'s refresh / retry behavior
That is the value of using this plugin instead of a plain opencode provider entry: it preserves the Kimi-only OAuth path, fingerprint, and request extensions that the generic route does not.
@@ -123,7 +123,7 @@ During login the plugin:
- shows a verification URL and user code
- stores the OAuth token in opencode's auth store
- discovers the exact model slug, display name, and context length your account should send to Kimi
- discovers the exact model slug, display name, context length, and image-input capability your account should send to Kimi
- prints a config hint that uses the discovered display name and leaves context backfill to runtime metadata discovery
Access tokens refresh automatically while you use the model.
@@ -152,6 +152,7 @@ Fastest fix:
<summary><strong>Login and refresh details</strong></summary>
- The plugin queries `/coding/v1/models` during login so it can discover the current wire model id and context length for your account.
- The plugin also uses that discovery response to backfill image-input support into opencode's runtime model metadata, so pasted or dropped images reach Kimi instead of being downgraded into local error text.
- The printed config hint intentionally omits `limit`, because opencode requires both `limit.context` and `limit.output`, while Kimi's models endpoint only exposes `context_length`.
- Model discovery runs again on every token refresh, and a fresh loader instance can re-query `/coding/v1/models` on first use if it needs the current wire model id.
- On a `401`, the loader refreshes the access token once and retries the request once.
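The "refresh once, retry once" behavior in the last bullet can be sketched as follows. This is an illustration under stated assumptions, not the plugin's loader: `FetchLike`, `fetchWithSingleRefresh`, the fake fetch, and the `example.invalid` URL are hypothetical names introduced only for this example.

```typescript
// Hypothetical minimal fetch shape so the sketch runs without a network.
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ status: number }>

// Send the request; on a 401, refresh the token exactly once and retry exactly once.
// A second 401 is returned to the caller rather than triggering another refresh.
async function fetchWithSingleRefresh(
  url: string,
  getToken: () => string,
  refresh: () => Promise<string>,
  doFetch: FetchLike,
): Promise<{ status: number }> {
  const first = await doFetch(url, { headers: { Authorization: `Bearer ${getToken()}` } })
  if (first.status !== 401) return first
  const freshToken = await refresh()
  return doFetch(url, { headers: { Authorization: `Bearer ${freshToken}` } })
}

// Fake transport: only the refreshed token is accepted.
const calls: string[] = []
const fakeFetch: FetchLike = async (_url, init) => {
  const auth = init.headers["Authorization"] ?? ""
  calls.push(auth)
  return { status: auth === "Bearer new" ? 200 : 401 }
}

const demo = fetchWithSingleRefresh(
  "https://example.invalid/v1/chat",
  () => "old",
  async () => "new",
  fakeFetch,
)
```

Here `demo` resolves after two calls: the first with the stale token, the second with the refreshed one. Capping the retry at one attempt keeps a permanently rejected token from looping.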
+1 -1
@@ -1,6 +1,6 @@
{
"name": "opencode-kimi-full",
"version": "1.2.8",
"version": "1.2.9",
"description": "OpenCode plugin that brings the official Kimi OAuth device flow and Kimi-specific coding request fields to opencode, matching upstream kimi-cli.",
"license": "MIT",
"repository": {
+119 -14
@@ -26,6 +26,7 @@ type ModelDiscovery = {
model_id?: string
context_length?: number
model_display?: string
supports_image_in?: boolean
}
type ThinkingType = "enabled" | "disabled"
@@ -38,9 +39,20 @@ type KimiBodyFields = {
type ModelWithDiscoveryMetadata = {
name?: string
attachment?: boolean
limit?: {
context?: number
}
modalities?: {
input?: string[]
output?: string[]
}
capabilities?: {
attachment?: boolean
input?: {
image?: boolean
}
}
}
type KimiHookInput = {
@@ -252,6 +264,7 @@ function pickModelInfo(models: KimiModelInfo[]): ModelDiscovery {
model_id: picked.id,
context_length: picked.context_length,
model_display: picked.display_name,
supports_image_in: picked.supports_image_in,
}
}
@@ -275,10 +288,89 @@ function withDiscoveredDisplayName<T extends ModelWithDiscoveryMetadata>(model:
}
}
function sameStrings(left: string[] | undefined, right: string[] | undefined) {
if (left === right) return true
if (!left || !right) return false
if (left.length !== right.length) return false
return left.every((value, index) => value === right[index])
}
function uniqueStrings(values: string[]) {
return [...new Set(values)]
}
function withDiscoveredImageInput<T extends ModelWithDiscoveryMetadata>(model: T, supportsImageIn: boolean | undefined): T {
if (supportsImageIn === undefined) return model
let changed = false
let nextAttachment = model.attachment
let nextModalities = model.modalities
let nextCapabilities = model.capabilities
if (supportsImageIn && model.attachment !== true) {
nextAttachment = true
changed = true
}
const currentInputModalities = model.modalities?.input
const currentOutputModalities = model.modalities?.output
const shouldPatchModalities = supportsImageIn || currentInputModalities?.includes("image") === true
if (shouldPatchModalities) {
const nextInputModalities = uniqueStrings([
"text",
...(currentInputModalities ?? []),
...(supportsImageIn ? ["image"] : []),
]).filter((value) => value !== "image" || supportsImageIn)
const nextOutputModalities = uniqueStrings(["text", ...(currentOutputModalities ?? [])])
if (
!sameStrings(currentInputModalities, nextInputModalities) ||
!sameStrings(currentOutputModalities, nextOutputModalities)
) {
nextModalities = {
...model.modalities,
input: nextInputModalities,
output: nextOutputModalities,
}
changed = true
}
}
const currentCapabilityImage = model.capabilities?.input?.image
const currentCapabilityAttachment = model.capabilities?.attachment
if (currentCapabilityImage !== undefined && currentCapabilityImage !== supportsImageIn) {
nextCapabilities = {
...nextCapabilities,
input: {
...nextCapabilities?.input,
image: supportsImageIn,
},
}
changed = true
}
if (supportsImageIn && currentCapabilityAttachment !== undefined && currentCapabilityAttachment !== true) {
nextCapabilities = {
...nextCapabilities,
attachment: true,
}
changed = true
}
if (!changed) return model
return {
...model,
...(nextAttachment === undefined ? {} : { attachment: nextAttachment }),
...(nextModalities ? { modalities: nextModalities } : {}),
...(nextCapabilities ? { capabilities: nextCapabilities } : {}),
}
}
function applyDiscoveryToModels<T extends Record<string, ModelWithDiscoveryMetadata>>(models: T, discovery: ModelDiscovery): T {
const current = models[MODEL_ID]
if (!current) return models
const next = withDiscoveredContext(withDiscoveredDisplayName(current, discovery.model_display), discovery.context_length)
const next = withDiscoveredImageInput(
withDiscoveredContext(withDiscoveredDisplayName(current, discovery.model_display), discovery.context_length),
discovery.supports_image_in,
)
if (next === current) return models
return {
...models,
@@ -286,7 +378,7 @@ function applyDiscoveryToModels<T extends Record<string, ModelWithDiscoveryMetad
}
}
function buildConfigBlock(info: { model_id: string; display?: string }) {
function buildConfigBlock(info: { model_id: string; display?: string; supports_image_in?: boolean }) {
const name = info.display ?? "Kimi For Coding"
// The opencode-side model key is always MODEL_ID ("kimi-for-coding"); the
// plugin rewrites the wire `model` body field to `info.model_id` inside
@@ -297,6 +389,29 @@ function buildConfigBlock(info: { model_id: string; display?: string }) {
// `limit.output` whenever a `limit` object is present, but Kimi's
// `/coding/v1/models` discovery only tells us `context_length`. The
// provider.models hook backfills `limit.context` at runtime.
const modelConfig: Record<string, unknown> = {
name,
reasoning: true,
options: {},
variants: {
off: { reasoning_effort: "off" },
auto: { reasoning_effort: "auto" },
low: { reasoning_effort: "low" },
medium: { reasoning_effort: "medium" },
high: { reasoning_effort: "high" },
},
}
if (info.supports_image_in) {
// opencode's provider transform gates image parts on model metadata
// before the request reaches our loader. Mirror Kimi's discovered
// capability here so pasted images survive into the upstream SDK.
modelConfig.attachment = true
modelConfig.modalities = {
input: ["text", "image"],
output: ["text"],
}
}
return JSON.stringify(
{
provider: {
@@ -305,18 +420,7 @@ function buildConfigBlock(info: { model_id: string; display?: string }) {
name: "Kimi For Coding (OAuth)",
options: { baseURL: API_BASE_URL },
models: {
[MODEL_ID]: {
name,
reasoning: true,
options: {},
variants: {
off: { reasoning_effort: "off" },
auto: { reasoning_effort: "auto" },
low: { reasoning_effort: "low" },
medium: { reasoning_effort: "medium" },
high: { reasoning_effort: "high" },
},
},
[MODEL_ID]: modelConfig,
},
},
},
@@ -635,6 +739,7 @@ const plugin: Plugin = async ({ client }) => {
const block = buildConfigBlock({
model_id: discovered.model_id,
display: discovered.model_display,
supports_image_in: discovered.supports_image_in,
})
console.log(
`\n✓ Authorized for Kimi For Coding (model: ${discovered.model_id}${
+71 -5
@@ -292,19 +292,55 @@ function makeProviderState(context = 0) {
id: PROVIDER_ID,
models: {
[MODEL_ID]: {
id: MODEL_ID,
providerID: PROVIDER_ID,
api: {
id: MODEL_ID,
npm: "@ai-sdk/openai-compatible",
url: "https://api.kimi.com/coding/v1",
},
status: "active",
headers: {},
name: "Kimi For Coding",
reasoning: true,
options: {},
limit: { context },
cost: { input: 0, output: 0, cache: { read: 0, write: 0 } },
limit: { context, output: 8192 },
capabilities: {
temperature: false,
reasoning: true,
attachment: false,
toolcall: true,
input: { text: true, audio: false, image: false, video: false, pdf: false },
output: { text: true, audio: false, image: false, video: false, pdf: false },
interleaved: false,
},
variants: {
auto: { reasoning_effort: "auto" },
},
},
"some-other-model": {
id: "some-other-model",
providerID: PROVIDER_ID,
api: {
id: "some-other-model",
npm: "@ai-sdk/openai-compatible",
url: "https://api.kimi.com/coding/v1",
},
status: "active",
headers: {},
name: "Other",
reasoning: false,
options: {},
limit: { context: 1234 },
cost: { input: 0, output: 0, cache: { read: 0, write: 0 } },
limit: { context: 1234, output: 4096 },
capabilities: {
temperature: false,
reasoning: false,
attachment: false,
toolcall: true,
input: { text: true, audio: false, image: false, video: false, pdf: false },
output: { text: true, audio: false, image: false, video: false, pdf: false },
interleaved: false,
},
},
},
}
@@ -343,6 +379,25 @@ test("provider.models: surfaces discovered display_name in runtime model metadat
expect(provider.models[MODEL_ID]!.name).toBe("Kimi For Coding")
})
test("provider.models: surfaces discovered image input capability so opencode does not strip images", async () => {
mock = installFetchMock((call) => {
if (call.url.endsWith("/coding/v1/models")) {
return {
body: {
data: [{ id: MODEL_ID, display_name: "Kimi Code", context_length: 262144, supports_image_in: true }],
},
}
}
return { body: { ok: true } }
})
const { hooks } = await getHooks()
const provider = makeProviderState()
const next = await hooks.provider!.models!(provider as any, { auth: validAuth() } as any)
expect(next[MODEL_ID]!.capabilities.input.image).toBe(true)
expect(next[MODEL_ID]!.capabilities.attachment).toBe(true)
expect(provider.models[MODEL_ID]!.capabilities.input.image).toBe(false)
})
test("provider.models: preserves an explicit user context limit", async () => {
mock = installFetchMock((call) => {
if (call.url.endsWith("/coding/v1/models")) {
@@ -946,7 +1001,11 @@ test("auth callback prints a schema-valid config snippet with top-level model va
}
}
if (call.url.endsWith("/coding/v1/models")) {
return { body: { data: [{ id: "kimi-for-coding", display_name: "Kimi Code", context_length: 262144 }] } }
return {
body: {
data: [{ id: "kimi-for-coding", display_name: "Kimi Code", context_length: 262144, supports_image_in: true }],
},
}
}
return { body: { access_token: "A", refresh_token: "R", token_type: "Bearer", expires_in: 900 } }
})
@@ -974,7 +1033,9 @@ test("auth callback prints a schema-valid config snippet with top-level model va
[key: string]: {
models: {
[key: string]: {
attachment?: boolean
limit?: { context?: number }
modalities?: { input?: string[]; output?: string[] }
options?: Record<string, unknown>
variants?: Record<string, { reasoning_effort?: string }>
}
@@ -984,7 +1045,12 @@ test("auth callback prints a schema-valid config snippet with top-level model va
}
const model = parsed.provider[PROVIDER_ID]!.models[MODEL_ID]!
expect(text).toContain("context 262144")
expect(model.attachment).toBe(true)
expect(model.limit).toBeUndefined()
expect(model.modalities).toEqual({
input: ["text", "image"],
output: ["text"],
})
expect(model.options).toEqual({})
expect(model.variants?.off).toEqual({ reasoning_effort: "off" })
expect(model.variants?.auto).toEqual({ reasoning_effort: "auto" })