Mapped model fallback router for OpenCode.
There are situations where you may want to use the quota that comes with a subscription first, then fall back to an API pay-as-you-go model only when that subscription-backed model is rate-limited or usage-limited. You can solve that with a local proxy, but maintaining a proxy server is often not worth it if all you need is a simple one-to-one fallback inside OpenCode. This plugin handles that routing directly in OpenCode.
When a configured model hits a retryable provider failure, this plugin aborts the in-flight request, replays the latest user message on the mapped fallback model, persists a global cooldown for the failed model, and routes back to the original model after the cooldown expires.
Add to your OpenCode config at ~/.config/opencode/config.json:
{
  "plugin": ["@renjfk/opencode-model-fallback"]
}
If you want to set plugin options, use the tuple form:
{
  "plugin": [
    [
      "@renjfk/opencode-model-fallback",
      {
        "mappings": {
          "openai/gpt-5.4": "azure-ai-foundry/gpt-5.4",
          "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5"
        }
      }
    ]
  ]
}
mappings: map original model IDs to fallback model IDs.
retry_on_errors: retryable HTTP status codes. Defaults to 429.
retryable_error_patterns: retryable error message patterns. Defaults to ["rate.?limit"].
cooldown_seconds: how long a failed original model remains on fallback. Defaults to 3600.
timeout_seconds: abort and retry if a response is inactive for this long. Defaults to 30.
notify_on_fallback: show fallback/recovery toasts. Defaults to true.
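For readers who prefer types to prose, the options and defaults listed above can be summarized as a small TypeScript sketch. The FallbackOptions interface and the withDefaults helper are hypothetical names used only for illustration; the plugin's real internal types are not shown in this README, and the empty default for mappings is an assumption.

```typescript
// Hypothetical shape mirroring the documented options (not the plugin's real types).
interface FallbackOptions {
  mappings: Record<string, string>;
  retry_on_errors: number[];
  retryable_error_patterns: string[];
  cooldown_seconds: number;
  timeout_seconds: number;
  notify_on_fallback: boolean;
}

// Defaults as documented above. `mappings` has no documented default,
// so an empty object is assumed here.
const DEFAULT_OPTIONS: FallbackOptions = {
  mappings: {},
  retry_on_errors: [429],
  retryable_error_patterns: ["rate.?limit"],
  cooldown_seconds: 3600,
  timeout_seconds: 30,
  notify_on_fallback: true,
};

// Overlay user-supplied options on top of the defaults (hypothetical helper).
function withDefaults(user: Partial<FallbackOptions>): FallbackOptions {
  return { ...DEFAULT_OPTIONS, ...user };
}

const exampleOptions = withDefaults({
  mappings: { "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5" },
  cooldown_seconds: 600,
});

console.log(exampleOptions.cooldown_seconds); // 600 (overridden)
console.log(exampleOptions.timeout_seconds); // 30 (default kept)
```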
The plugin watches OpenCode chat and session events. When a request uses a model
listed in mappings, that model is preferred unless it has an active global
cooldown. If OpenCode reports a retryable provider failure, the plugin switches
to the mapped fallback model and stores a global cooldown for the failed model.
Global model cooldowns are persisted at:
~/.local/share/opencode/mapped-fallback-router.json
Persisted cooldowns let all sessions avoid immediately retrying a model that has just failed. When the cooldown expires, mapped requests are routed back to the original model.
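The idea of a shared, persisted cooldown can be sketched as follows. This is a minimal illustration and not the plugin's code: the CooldownState shape (model ID mapped to an expiry timestamp in epoch milliseconds) is an assumption, since the actual layout of mapped-fallback-router.json is not documented here, and the function names are hypothetical. File I/O is left out; the JSON round trip stands in for one session writing the file and another reading it.

```typescript
// ASSUMED shape: model ID -> epoch milliseconds at which the cooldown ends.
type CooldownState = Record<string, number>;

// Record a cooldown for a model that just failed.
function startCooldown(
  state: CooldownState,
  modelId: string,
  nowMs: number,
  cooldownSeconds: number,
): CooldownState {
  return { ...state, [modelId]: nowMs + cooldownSeconds * 1000 };
}

// A model is cooling down only while the current time is before its expiry.
function isCoolingDown(state: CooldownState, modelId: string, nowMs: number): boolean {
  const until = state[modelId];
  return until !== undefined && nowMs < until;
}

// Drop entries whose cooldown has already expired.
function pruneExpired(state: CooldownState, nowMs: number): CooldownState {
  const kept: CooldownState = {};
  for (const id of Object.keys(state)) {
    const until = state[id];
    if (until !== undefined && nowMs < until) kept[id] = until;
  }
  return kept;
}

// Session A records a failure and serializes the state.
const failureTimeMs = 1700000000000;
const savedText = JSON.stringify(startCooldown({}, "openai/gpt-5.5", failureTimeMs, 3600));

// Session B loads the same state later and consults it before routing.
const loadedState = JSON.parse(savedText) as CooldownState;
console.log(isCoolingDown(loadedState, "openai/gpt-5.5", failureTimeMs + 1800 * 1000)); // true
console.log(isCoolingDown(loadedState, "openai/gpt-5.5", failureTimeMs + 3600 * 1000)); // false
```

Storing an absolute expiry time rather than a remaining duration means a session that starts later needs no extra bookkeeping to decide whether the cooldown still applies.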
The plugin does not load balance, race models, or retry through a chain. Each mapping is one original model to one fallback model.
If you send a message with openai/gpt-5.5 and the model has a mapping, the
request goes to openai/gpt-5.5 normally unless it has an active cooldown.
If you select the mapped fallback model directly, the plugin still routes back
to the original model unless the original has an active cooldown.
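The two routing rules above reduce to one decision: find the mapping pair the requested model belongs to, then pick the original unless it is cooling down. The sketch below illustrates that decision under stated assumptions. resolveModel and the coolingDown callback are hypothetical names, the model ID example/unmapped-model is made up for the demonstration, and the sketch assumes each fallback model appears in only one mapping.

```typescript
type Mappings = Record<string, string>;

// Decide which model a request should actually use (illustrative only).
function resolveModel(
  requested: string,
  mappings: Mappings,
  coolingDown: (modelId: string) => boolean,
): string {
  // Find the original model of the pair, whether the user selected
  // the original itself or its mapped fallback.
  let original: string | undefined;
  if (mappings[requested] !== undefined) {
    original = requested;
  } else {
    for (const key of Object.keys(mappings)) {
      if (mappings[key] === requested) {
        original = key;
        break;
      }
    }
  }

  // Models that are not part of any mapping are left untouched.
  if (original === undefined) return requested;
  const fallback = mappings[original];
  if (fallback === undefined) return requested;

  // Prefer the original unless it has an active cooldown.
  return coolingDown(original) ? fallback : original;
}

const routeMappings: Mappings = { "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5" };
const noCooldown = (_id: string) => false;
const originalCooling = (id: string) => id === "openai/gpt-5.5";

console.log(resolveModel("openai/gpt-5.5", routeMappings, noCooldown)); // openai/gpt-5.5
console.log(resolveModel("azure-ai-foundry/gpt-5.5", routeMappings, noCooldown)); // openai/gpt-5.5
console.log(resolveModel("openai/gpt-5.5", routeMappings, originalCooling)); // azure-ai-foundry/gpt-5.5
console.log(resolveModel("example/unmapped-model", routeMappings, originalCooling)); // example/unmapped-model
```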
If OpenCode reports a retryable provider failure such as a rate limit or a configured retryable status code, the plugin aborts the current request and replays the latest user message on the mapped fallback model.
For example:
{
  "mappings": {
    "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5"
  }
}
If openai/gpt-5.5 fails with a retryable error, the session continues on
azure-ai-foundry/gpt-5.5.
After fallback is triggered, the original model is considered cooling down for
cooldown_seconds. During that cooldown, mapped requests use the fallback model
instead of switching back and immediately hitting the same provider
failure again.
All sessions are routed straight to the fallback while the original model is cooling down.
When the cooldown expires, mapped requests switch back to the original model.
Mappings are one-to-one. If the fallback model also hits a retryable failure, there is no next fallback to try. The plugin shows a fallback exhausted toast when notifications are enabled.
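The one-to-one rule can be illustrated with a small decision function for what happens after a retryable failure. This is a sketch and not the plugin's implementation: FailureAction and onRetryableFailure are hypothetical names, and the sketch assumes no model is configured as both an original and a fallback.

```typescript
// Possible outcomes after a retryable failure (hypothetical type).
type FailureAction =
  | { kind: "fallback"; model: string } // replay on the mapped fallback
  | { kind: "exhausted" } // the fallback itself failed; nothing left to try
  | { kind: "ignore" }; // the model is not part of any mapping

function onRetryableFailure(
  failedModel: string,
  mappings: Record<string, string>,
): FailureAction {
  // The failed model is an original: switch to its single fallback.
  const fallback = mappings[failedModel];
  if (fallback !== undefined) return { kind: "fallback", model: fallback };

  // The failed model is itself a fallback: there is no chain to continue.
  for (const key of Object.keys(mappings)) {
    if (mappings[key] === failedModel) return { kind: "exhausted" };
  }

  return { kind: "ignore" };
}

const failureMappings = { "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5" };
console.log(onRetryableFailure("openai/gpt-5.5", failureMappings).kind); // fallback
console.log(onRetryableFailure("azure-ai-foundry/gpt-5.5", failureMappings).kind); // exhausted
```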
Use OpenCode's provider logs to find the exact status code, headers, and error
body returned by a provider. This is the most reliable way to tune
retry_on_errors and retryable_error_patterns.
For a short headless reproduction, capture logs and stop the run after a few seconds to avoid long retry loops:
log="/tmp/opencode-provider.log"
: > "$log"
opencode run --print-logs --log-level DEBUG --model openai/gpt-5.3-codex --format json "Reply with OK only." 2> "$log" &
pid=$!
sleep 3
kill "$pid" 2>/dev/null || true
wait "$pid" 2>/dev/null || true
Then inspect the captured provider errors:
rg 'service=llm|AI_APICallError|statusCode|responseBody|x-codex|reset|usage_limit|rate.?limit' /tmp/opencode-provider.log
Look for an OpenCode log line like:
ERROR ... service=llm providerID=openai modelID=gpt-5.3-codex ... error={...}
Inside error, check fields such as statusCode, responseHeaders,
responseBody, isRetryable, and data.error.message. For example, OpenAI
usage limits can appear as statusCode: 429 with a response body containing
usage_limit_reached and The usage limit has been reached. OpenAI Codex
responses can also include reset headers such as x-codex-primary-reset-at,
x-codex-primary-reset-after-seconds, x-codex-secondary-reset-at, and
x-codex-secondary-reset-after-seconds.
For TUI sessions, start OpenCode the same way and reproduce manually:
opencode --print-logs --log-level DEBUG 2> /tmp/opencode-provider.log
Use the provider statusCode and response body text to tune the retry rules:
{
  "plugin": [
    [
      "@renjfk/opencode-model-fallback",
      {
        "retry_on_errors": [429, 403],
        "retryable_error_patterns": ["rate.?limit", "usage.?limit"],
        "mappings": {
          "openai/gpt-5.5": "azure-ai-foundry/gpt-5.5"
        }
      }
    ]
  ]
}
If the status code is not in retry_on_errors, add it. If the response body has
stable text or an error type, add a small regex matching it to
retryable_error_patterns. If there is no service=llm error line, OpenCode did
not reach the provider or the run was stopped before the provider returned.
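To check a candidate rule set against the status code and body text found in the logs before editing the config, a quick classifier like the one below can help. It is a sketch under assumptions: isRetryable and RetryRules are hypothetical names, and treating a failure as retryable when either the status code or any pattern matches, with case-insensitive regex matching, is a guess at the plugin's behavior rather than something this README confirms.

```typescript
interface RetryRules {
  retry_on_errors: number[];
  retryable_error_patterns: string[];
}

// ASSUMPTION: a failure is retryable if its status code is listed OR any
// pattern matches the error text; case-insensitive matching is also assumed.
function isRetryable(
  statusCode: number | undefined,
  errorText: string,
  rules: RetryRules,
): boolean {
  if (statusCode !== undefined && rules.retry_on_errors.indexOf(statusCode) !== -1) {
    return true;
  }
  return rules.retryable_error_patterns.some((pattern) =>
    new RegExp(pattern, "i").test(errorText),
  );
}

const defaultRules: RetryRules = {
  retry_on_errors: [429],
  retryable_error_patterns: ["rate.?limit"],
};
const tunedRules: RetryRules = {
  retry_on_errors: [429, 403],
  retryable_error_patterns: ["rate.?limit", "usage.?limit"],
};

// A 429 matches on status code alone.
console.log(isRetryable(429, "", defaultRules)); // true
// A usage-limit body with an unlisted status is missed by the defaults...
console.log(isRetryable(500, "usage_limit_reached", defaultRules)); // false
// ...and caught once "usage.?limit" is added.
console.log(isRetryable(500, "usage_limit_reached", tunedRules)); // true
```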
opencode-model-fallback is open to contributions and ideas!
Format: type: brief description
feat: new features or functionality
fix: bug fixes
enhance: improvements to existing features
chore: maintenance tasks, dependencies, cleanup
docs: documentation updates
build: build system, CI/CD changes
npm run test # node test suite
npm run check # test + lint + fmt
npm run lint # oxlint
npm run fmt # oxfmt --check
npm run fmt:fix # oxfmt --write
To test unpublished changes in the OpenCode TUI, point ~/.config/opencode/config.json
at the local repo path, not the npm package name:
{
  "plugin": ["/Users/your-user/opencode-model-fallback"]
}
Releases are made manually via opencode; see RELEASE_PROCESS.md.
This project is licensed under the MIT License.