feat: dynamic Bedrock model listing with LiteLLM metadata enrichment#3352

Open
carTloyal123 wants to merge 1 commit into
tailcallhq:mainfrom
carTloyal123:feat/bedrock-dynamic-models

@carTloyal123

Summary

Replaces the hardcoded Bedrock model list with live AWS API calls, ensuring users only see models they have access to in their account and region. This solves the GovCloud model filtering problem and improves UX for commercial Bedrock users.

Problem

The Bedrock provider returns 80+ hardcoded models from provider.json regardless of the user's account, region, or enabled models. GovCloud users see commercial-only models they can't use, and commercial users see models they haven't enabled in their Bedrock console.

Solution

Two-source architecture:

  1. AWS ListFoundationModels + ListInferenceProfiles -- Account/region-aware APIs that return only accessible models. Filtered to ON_DEMAND, streaming-capable, ACTIVE models.

  2. LiteLLM community registry (model_prices_and_context_window.json) -- Enriches each model with context_length, tools_supported, supports_reasoning, supports_vision metadata that AWS APIs don't provide.

Both sources gracefully degrade: AWS API failure falls back to the hardcoded list; LiteLLM failure just means missing metadata.
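
The degradation rules above can be illustrated with a minimal sketch. It is not the code in bedrock.rs: `Model`, `RegistryEntry`, and `resolve_models` are hypothetical names, the live listing result and the registry are passed in as plain values instead of being fetched, and only `context_length` is enriched. It also assumes the hardcoded fallback list is enriched the same way as a live list.

```rust
use std::collections::HashMap;

/// Simplified model record (hypothetical; the real type has more fields).
#[derive(Debug, Clone, PartialEq)]
struct Model {
    id: String,
    context_length: Option<u64>,
}

/// Simplified registry entry (hypothetical stand-in for a LiteLLM record).
#[derive(Debug, Clone)]
struct RegistryEntry {
    context_length: Option<u64>,
}

/// Combine the two sources:
/// - a failed live listing falls back to the hardcoded list with a warning;
/// - a missing registry only means metadata stays unset.
fn resolve_models(
    live: Result<Vec<Model>, String>,
    registry: Option<&HashMap<String, RegistryEntry>>,
    hardcoded: &[Model],
) -> Vec<Model> {
    let mut models = match live {
        Ok(models) => models,
        Err(err) => {
            eprintln!("warning: live Bedrock listing failed ({err}); using hardcoded list");
            hardcoded.to_vec()
        }
    };

    if let Some(registry) = registry {
        for model in &mut models {
            // Never overwrite metadata that is already present.
            if model.context_length.is_none() {
                if let Some(entry) = registry.get(&model.id) {
                    model.context_length = entry.context_length;
                }
            }
        }
    }
    models
}
```

Passing `Err(..)` for `live` returns the hardcoded list unchanged where it already carries metadata, and passing `None` for `registry` returns the live list with `context_length` left as `None`.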

Changes

| File | Description |
| --- | --- |
| Cargo.toml | Added aws-sdk-bedrock workspace dependency |
| crates/forge_repo/Cargo.toml | Added aws-sdk-bedrock crate reference |
| crates/forge_repo/src/provider/bedrock.rs | +613 lines: control plane client, live model listing, LiteLLM enrichment, FIPS support, 10 new tests |
| crates/forge_repo/src/provider/provider.json | Added optional AWS_USE_FIPS URL param for GovCloud |

Key Implementation Details

  • Control plane client (init_control_client()): Separate aws_sdk_bedrock::Client for model listing, same auth mode and region as the chat client
  • Model filtering: Only ON_DEMAND, streaming-capable, ACTIVE lifecycle models included
  • Inference profiles: Cross-region profiles (e.g., us.anthropic.claude-3-5-sonnet) merged and deduplicated
  • LiteLLM enrichment: Fetched and cached with 7-day TTL via existing model cache infra. Tries both raw model ID and Bedrock-prefixed ID for lookup
  • Fallback: If ListFoundationModels fails (for example, because an IAM permission is missing), returns the hardcoded list and logs a warning
  • FIPS support: Optional AWS_USE_FIPS=true param enables FIPS endpoints on both data and control plane clients
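
The filtering and merge steps in the list above could look roughly like the following sketch. `FoundationModel` is a simplified, hypothetical stand-in for the SDK's model summary type (the field names here are assumptions, not the aws-sdk-bedrock API), and `is_listable` and `merge_model_ids` are illustrative names that do not come from this PR.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a ListFoundationModels summary (hypothetical fields).
#[derive(Debug, Clone, PartialEq)]
struct FoundationModel {
    id: String,
    inference_types: Vec<String>, // e.g. "ON_DEMAND", "PROVISIONED"
    streaming: bool,
    lifecycle: String, // e.g. "ACTIVE", "LEGACY"
}

/// Keep only ON_DEMAND, streaming-capable models with an ACTIVE lifecycle.
fn is_listable(model: &FoundationModel) -> bool {
    model.inference_types.iter().any(|t| t == "ON_DEMAND")
        && model.streaming
        && model.lifecycle == "ACTIVE"
}

/// Merge filtered foundation model IDs with inference profile IDs,
/// dropping duplicates while preserving first-seen order.
fn merge_model_ids(foundation: &[FoundationModel], profile_ids: &[String]) -> Vec<String> {
    let mut seen = HashSet::new();
    foundation
        .iter()
        .filter(|m| is_listable(m))
        .map(|m| m.id.clone())
        .chain(profile_ids.iter().cloned())
        // `insert` returns false for an ID that was already seen.
        .filter(|id| seen.insert(id.clone()))
        .collect()
}
```

Deduplicating on the ID string keeps a cross-region profile such as us.anthropic.claude-3-5-sonnet alongside its base model, since the two IDs differ, while an ID returned by both APIs appears only once.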

Testing

  • 10 new unit tests (72 total pass, 1 ignored):
    • Model mapping: Claude, Nova, Llama foundation models
    • Registry enrichment: Bedrock prefix lookup, raw ID lookup, not-found handling, no-overwrite of existing data
    • Fallback: Hardcoded list returned on API error
    • FIPS: Config parsing for true/false/absent
  • cargo check passes across full workspace
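
To make the enrichment and FIPS cases above concrete, here is a minimal sketch of the behaviour those tests describe. It is an illustration under assumptions, not the PR's code: `ModelMeta`, `lookup`, `enrich`, and `parse_use_fips` are hypothetical names, the `bedrock/` key prefix is an assumed form of the "Bedrock-prefixed ID", and treating only a case-insensitive "true" as enabled is likewise an assumption.

```rust
use std::collections::HashMap;

/// Subset of the metadata the registry can supply (hypothetical type).
#[derive(Debug, Clone, Default, PartialEq)]
struct ModelMeta {
    context_length: Option<u64>,
    tools_supported: Option<bool>,
}

/// Try the raw model ID first, then the prefixed form.
/// The "bedrock/" prefix is an assumption made for this sketch.
fn lookup<'a>(registry: &'a HashMap<String, ModelMeta>, model_id: &str) -> Option<&'a ModelMeta> {
    registry
        .get(model_id)
        .or_else(|| registry.get(&format!("bedrock/{model_id}")))
}

/// Fill in only the fields that are still unset (no overwrite of existing data).
fn enrich(existing: &mut ModelMeta, found: Option<&ModelMeta>) {
    if let Some(found) = found {
        existing.context_length = existing.context_length.or(found.context_length);
        existing.tools_supported = existing.tools_supported.or(found.tools_supported);
    }
}

/// Interpret the optional AWS_USE_FIPS value: absent or anything but "true" is false.
fn parse_use_fips(value: Option<&str>) -> bool {
    matches!(value.map(str::trim), Some(v) if v.eq_ignore_ascii_case("true"))
}
```

A not-found lookup returns `None`, which `enrich` treats as a no-op, so a model missing from the registry simply keeps whatever metadata it already had.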

GovCloud Usage

forge provider login  # select bedrock
# Choose "AWS Profile (SSO/IAM)"
# Enter GovCloud profile name
# Set AWS_REGION to us-gov-west-1
# Optionally set AWS_USE_FIPS to true
forge list models  # shows only GovCloud-available models

Co-Authored-By: ForgeCode <noreply@forgecode.dev>

@CLAassistant

CLAassistant commented May 17, 2026

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

@github-actions (bot) added the labels "type: feature" (Brand new functionality, features, pages, workflows, endpoints, etc.) and "type: provider" (Updates provider.json configuration.) on May 17, 2026
Replace the hardcoded Bedrock model list with live AWS API calls to
ListFoundationModels and ListInferenceProfiles. This ensures users
only see models they have access to in their account and region,
solving the GovCloud model filtering problem.

Key changes:
- Add aws-sdk-bedrock dependency for control plane API
- Implement init_control_client() for model listing (separate from
  chat client)
- Call ListFoundationModels filtered to ON_DEMAND, streaming, ACTIVE
- Call ListInferenceProfiles for cross-region inference profiles
- Enrich model metadata (context_length, tools, reasoning, vision)
  from the LiteLLM community registry JSON
- Graceful fallback to hardcoded model list when API permissions
  are missing
- Add optional AWS_USE_FIPS URL param for GovCloud FIPS endpoints
- Add 10 new unit tests covering model mapping, enrichment,
  fallback, and FIPS config

GovCloud users with us-gov-west-1 region will now see only their
available models with correct context window sizes. Commercial
users benefit from seeing only models they have enabled.

Co-Authored-By: ForgeCode <noreply@forgecode.dev>
@carTloyal123 carTloyal123 force-pushed the feat/bedrock-dynamic-models branch from c2fefa1 to 9614c9f Compare May 17, 2026 22:08