[Configurator] Add GymnasiumAdapter for CloudAI envs by rutayan-nv · Pull Request #894 · NVIDIA/cloudai

rutayan-nv · 2026-05-15T21:13:50Z

GymnasiumAdapter wraps a CloudAI BaseGym as a gymnasium-compatible env: Dict of Discrete actions over the tunable params, float32 Box observation derived from define_observation_space(), gymnasium 5-tuple from step / step_raw.
Fixed (single-value) params are injected on every step so agents only ever see the tunable subset; step_raw bypasses index decoding and fixed injection for callers that already have a fully-formed parameter dict.
The adapter mirrors handle_dse_job's 1-based test_run.step so artifact paths match whether the env is driven by the DSE loop or by an external training loop wrapping the adapter.
CloudAIGymEnv.define_observation_space() now returns one slot per agent metric (at least one), giving adapters the correct Box shape.
gymnasium is an optional dependency: lazy-imported inside the adapter and exposed via a new rl extra (also added to dev) pinned to ~=1.2.

coderabbitai · 2026-05-15T21:14:04Z

📝 Walkthrough

Walkthrough

Adds GymnasiumAdapter wrapping BaseGym to expose Gymnasium-style action/observation spaces and APIs; makes CloudAIGym observation size metric-driven; exports the adapter; adds optional gymnasium dev/rl extras and a comprehensive pytest suite for the adapter.

Changes

Gymnasium adapter integration

Layer / File(s)	Summary
CloudAIGymEnv observation-space contract `src/cloudai/configurator/cloudai_gym.py`	`define_observation_space()` returns a float list sized to `max(len(agent_metrics), 1)` and `reset()` uses it to produce initial observations instead of a fixed placeholder.
GymnasiumAdapter implementation `src/cloudai/configurator/gymnasium_adapter.py`	New adapter with lazy import guard for gymnasium/numpy; builds `action_space` as `spaces.Dict` of `spaces.Discrete` for tunable params (excluding single-value fixed params), `observation_space` as `spaces.Box` of `float32` shaped from `env.define_observation_space()`; implements `decode_action`, `reset`, `step`, `step_raw`, `render`, key validation, test-run step synchronization, and observation conversion.
Module exports and dependencies `pyproject.toml`, `src/cloudai/configurator/__init__.py`	Adds `gymnasium~=1.2` to dev and an `rl` extra depending on it; exports `GymnasiumAdapter` from the configurator package.
GymnasiumAdapter test suite `tests/test_gymnasium_adapter.py`	Deterministic fake environments and fixtures validate action/observation spaces, `reset`/`step` outputs and dtypes, action decoding, fixed-parameter injection, `step_raw` behavior, `test_run.step` synchronization, and missing-dependency/input-validation error messages.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I wrapped the gym with careful paws,
Discrete choices hop, fixed ones stay small.
Steps count from one, observations align,
Float32 arrays march in tidy rows.
A rabbit applauds this adapter call.

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and concisely summarizes the main change: adding GymnasiumAdapter for CloudAI environments.
Description check	✅ Passed	The description is directly related to the changeset, detailing the GymnasiumAdapter implementation, fixed parameters handling, step numbering, observation space changes, and optional dependency setup.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/gymnasium_adapter.py`:
- Around line 92-112: decode_action and step_raw currently allow partial or
unexpected parameter sets which can silently produce incorrect runs; update
decode_action, step, and step_raw to validate that the final parameter keys
exactly match the expected set (the union of self._tunable_params.keys() and
self._fixed_params.keys()) before calling _step_with_params, and raise a clear
exception (e.g., ValueError) listing missing or unexpected keys if validation
fails so the caller fails fast; enforce the same check when step builds params
from decode_action and when step_raw is passed params directly, and reference
the symbols decode_action, step, step_raw, self._tunable_params,
self._fixed_params, and _step_with_params to locate the changes.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 9474faf6-0f97-42f8-82a0-1e23932471bc

📥 Commits

Reviewing files that changed from the base of the PR and between 4bdc465 and 71ddd51.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (5)

pyproject.toml
src/cloudai/configurator/__init__.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/gymnasium_adapter.py
tests/test_gymnasium_adapter.py

- GymnasiumAdapter wraps a CloudAI BaseGym as a gymnasium-compatible env: Dict of Discrete actions over the tunable params, float32 Box observation derived from define_observation_space(), gymnasium 5-tuple from step/step_raw. - Fixed (single-value) params are injected on every step so agents only ever see the tunable subset; step_raw bypasses index decoding and fixed injection for callers that already have a fully-formed parameter dict. - The adapter mirrors handle_dse_job's 1-based test_run.step so artifact paths match whether the env is driven by the DSE loop or an external training loop. - CloudAIGymEnv.define_observation_space() now returns one slot per agent metric (at least one), giving adapters the correct Box shape. - gymnasium is an optional dependency: lazy-imported inside the adapter and exposed via a new rl extra (also added to dev) pinned to ~=1.2.

decode_action used to accept partial action dicts and step_raw accepted arbitrary key sets, both silently propagating incomplete params to the underlying env. Validate up front: - decode_action requires keys exactly == self._tunable_params, and each discrete index must be in range. - step_raw requires keys exactly == self._tunable_params | self._fixed_params. A new _assert_keys helper raises ValueError with sorted missing / extra lists so callers see exactly what was wrong. Adds tests for missing key, unknown key, out-of-range index, missing fixed param, and unknown raw key paths.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cloudai/configurator/cloudai_gym.py`:
- Around line 79-82: The constraint-failure branch returns a hardcoded
single-element observation that violates the declared observation shape; update
the branch to return the same-width list as the normal case by using the
metric-derived width (use max(len(self.test_run.test.agent_metrics), 1)) instead
of a single 0.0 so that both the regular return (currently using [0.0] *
max(len(self.test_run.test.agent_metrics), 1)) and the constraint-failure branch
produce identically shaped observations.

In `@tests/test_gymnasium_adapter.py`:
- Around line 209-215: The tests assert on ValueError messages using exact list
formatting (e.g., match=r"missing=\['param_b'\]"), which is brittle; update the
assertions in test_decode_action_missing_key and
test_decode_action_rejects_unknown_keys to use more focused regexes that only
check for the presence of the key names (e.g., look for "param_b" and "bogus")
or words like "missing" / "extra" rather than exact list punctuation; adjust the
two places mentioned (the current block around adapter.decode_action({"param_a":
0}) and the block for adapter.decode_action({"param_a": 0, "param_b": 1,
"bogus": 0}) and the similar assertions at lines 226-234) to use these looser
key-focused regexes so tests validate behavior without depending on formatting.
- Around line 131-136: Remove the test's assertion on the private attribute
adapter._fixed_params and instead verify public behavior: keep the assertion
that adapter.action_space.spaces equals {"param_a","param_b"}, then call
GymnasiumAdapter.step (using the _FixedParamGym instance which records received
actions) with an action containing only "param_a"/"param_b" and assert the
environment received/processed an action that includes the fixed parameter (e.g.
the recorded last_action or returned observation/info shows "fixed_param": 42);
this exercises action-space exclusion plus the merged action behavior without
relying on the private _fixed_params field.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: bf1b3854-aa01-4f65-8aeb-7699901ac890

📥 Commits

Reviewing files that changed from the base of the PR and between 9937ef3 and 78b1bb3.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (5)

pyproject.toml
src/cloudai/configurator/__init__.py
src/cloudai/configurator/cloudai_gym.py
src/cloudai/configurator/gymnasium_adapter.py
tests/test_gymnasium_adapter.py

coderabbitai · 2026-05-18T16:36:55Z

+            list: One float slot per agent metric (at least one), giving the correct shape
+            for adapters that derive ``gymnasium.spaces.Box`` from this output.
        """
-        return [0.0]
+        return [0.0] * max(len(self.test_run.test.agent_metrics), 1)


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Return constraint-failure observations with the declared shape.

Line 82 makes observation width metric-driven, but the constraint-failure branch still returns a hardcoded single-element observation (Line 143). That breaks the fixed observation-shape contract expected by Gymnasium consumers.

Suggested fix

- return [-1.0], self.rewards.constraint_failure, True, {} + fallback_obs = [-1.0] * len(self.define_observation_space()) + return fallback_obs, self.rewards.constraint_failure, True, {}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/cloudai/configurator/cloudai_gym.py` around lines 79 - 82, The constraint-failure branch returns a hardcoded single-element observation that violates the declared observation shape; update the branch to return the same-width list as the normal case by using the metric-derived width (use max(len(self.test_run.test.agent_metrics), 1)) instead of a single 0.0 so that both the regular return (currently using [0.0] * max(len(self.test_run.test.agent_metrics), 1)) and the constraint-failure branch produce identically shaped observations.

coderabbitai · 2026-05-18T16:36:55Z

+def test_single_value_params_are_excluded_from_action_space() -> None:
+    adapter = GymnasiumAdapter(_FixedParamGym())
+
+    assert set(adapter.action_space.spaces) == {"param_a", "param_b"}
+    assert adapter._fixed_params == {"fixed_param": 42}
+


🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Avoid asserting private adapter internals in tests.

Line 135 couples the test to adapter._fixed_params, which is implementation detail. Assert only public behavior (action-space exclusion + merged action on step) to keep tests stable across refactors.

Suggested diff

def test_single_value_params_are_excluded_from_action_space() -> None: adapter = GymnasiumAdapter(_FixedParamGym()) assert set(adapter.action_space.spaces) == {"param_a", "param_b"} - assert adapter._fixed_params == {"fixed_param": 42}

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def test_single_value_params_are_excluded_from_action_space() -> None:

adapter = GymnasiumAdapter(_FixedParamGym())

assert set(adapter.action_space.spaces) == {"param_a", "param_b"}

assert adapter._fixed_params == {"fixed_param": 42}

def test_single_value_params_are_excluded_from_action_space() -> None:

adapter = GymnasiumAdapter(_FixedParamGym())

assert set(adapter.action_space.spaces) == {"param_a", "param_b"}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_gymnasium_adapter.py` around lines 131 - 136, Remove the test's assertion on the private attribute adapter._fixed_params and instead verify public behavior: keep the assertion that adapter.action_space.spaces equals {"param_a","param_b"}, then call GymnasiumAdapter.step (using the _FixedParamGym instance which records received actions) with an action containing only "param_a"/"param_b" and assert the environment received/processed an action that includes the fixed parameter (e.g. the recorded last_action or returned observation/info shows "fixed_param": 42); this exercises action-space exclusion plus the merged action behavior without relying on the private _fixed_params field.

coderabbitai · 2026-05-18T16:36:56Z

+    with pytest.raises(ValueError, match=r"missing=\['param_b'\]"):
+        adapter.decode_action({"param_a": 0})
+
+
+def test_decode_action_rejects_unknown_keys(adapter: GymnasiumAdapter) -> None:
+    with pytest.raises(ValueError, match=r"extra=\['bogus'\]"):
+        adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0})


🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Relax exception-message matching to avoid brittle formatting dependencies.

These assertions currently depend on exact list representation/order in error strings. Prefer key-focused regex so tests validate behavior, not message formatting details.

Suggested diff

def test_decode_action_rejects_missing_keys(adapter: GymnasiumAdapter) -> None: - with pytest.raises(ValueError, match=r"missing=\['param_b'\]"): + with pytest.raises(ValueError, match=r"missing=.*param_b"): adapter.decode_action({"param_a": 0}) def test_decode_action_rejects_unknown_keys(adapter: GymnasiumAdapter) -> None: - with pytest.raises(ValueError, match=r"extra=\['bogus'\]"): + with pytest.raises(ValueError, match=r"extra=.*bogus"): adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0}) @@ def test_step_raw_rejects_missing_fixed_param() -> None: @@ - with pytest.raises(ValueError, match=r"missing=\['fixed_param'\]"): + with pytest.raises(ValueError, match=r"missing=.*fixed_param"): adapter.step_raw({"param_a": 1, "param_b": 10}) @@ def test_step_raw_rejects_unknown_keys() -> None: @@ - with pytest.raises(ValueError, match=r"extra=\['bogus'\]"): + with pytest.raises(ValueError, match=r"extra=.*bogus"): adapter.step_raw({"param_a": 1, "param_b": 10, "fixed_param": 42, "bogus": 0})

Also applies to: 226-234

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_gymnasium_adapter.py` around lines 209 - 215, The tests assert on ValueError messages using exact list formatting (e.g., match=r"missing=\['param_b'\]"), which is brittle; update the assertions in test_decode_action_missing_key and test_decode_action_rejects_unknown_keys to use more focused regexes that only check for the presence of the key names (e.g., look for "param_b" and "bogus") or words like "missing" / "extra" rather than exact list punctuation; adjust the two places mentioned (the current block around adapter.decode_action({"param_a": 0}) and the block for adapter.decode_action({"param_a": 0, "param_b": 1, "bogus": 0}) and the similar assertions at lines 226-234) to use these looser key-focused regexes so tests validate behavior without depending on formatting.

podkidyshev · 2026-05-20T09:24:13Z

+
+from .base_gym import BaseGym
+
+_GYMNASIUM_INSTALL_HINT = "gymnasium is required for GymnasiumAdapter. Install it with: pip install gymnasium"


Suggested change

_GYMNASIUM_INSTALL_HINT = "gymnasium is required for GymnasiumAdapter. Install it with: pip install gymnasium"

_GYMNASIUM_INSTALL_HINT = "gymnasium is required for GymnasiumAdapter. Install it with: uv sync --extra rl"

podkidyshev · 2026-05-20T09:25:16Z

+_GYMNASIUM_INSTALL_HINT = "gymnasium is required for GymnasiumAdapter. Install it with: pip install gymnasium"
+
+
+def _import_gymnasium():


please introduce/move it into src/cloudai/util/lazy_imports.py. numpy is already there, just add gymnasium and maybe optionally spaces as a separate property

podkidyshev · 2026-05-20T09:27:13Z

+    def __init__(self, env: BaseGym) -> None:
+        _, spaces, np = _import_gymnasium()
+
+        self._np = np


please do not keep module as a variable, use it when needed from lazy.np.<smth> with from cloudai.util.lazy_imports import lazy

podkidyshev · 2026-05-20T10:01:49Z

+        """
+        test_run = getattr(self._env, "test_run", None)
+        if test_run is not None:
+            test_run.step = self._step_count + 1


who's responsible for steps budget now? for standard argents it's handle_dse_job loop with agent.max_steps

apparently this is the only places that anyhow touches cloudai internals. if we solve it somehow - I suggest the adapter is to be put not in cloudai but in the private repo.

if the only consumer is our private RL agents, I’d prefer this live with those agents, while CloudAI exposes only the minimal env/execution hooks they need. Moving it to core makes Gymnasium compatibility look like a supported CloudAI API, but we do not yet have the integration tests or API ownership story for that

rutayan-nv requested review from jeffnvidia, podkidyshev and srivatsankrishnan as code owners May 15, 2026 21:13

This was referenced May 15, 2026

Gym enhancements #863

Draft

More Gym enhancements #884

Draft

coderabbitai Bot reviewed May 15, 2026

View reviewed changes

Comment thread src/cloudai/configurator/gymnasium_adapter.py

rutayan-nv changed the title ~~feat(configurator): add GymnasiumAdapter for CloudAI envs~~ [Configurator] Add GymnasiumAdapter for CloudAI envs May 15, 2026

rutayan-nv added 2 commits May 18, 2026 12:33

rutayan-nv force-pushed the rpatro/gymnasium-adapter branch from 9937ef3 to 78b1bb3 Compare May 18, 2026 16:33

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

podkidyshev requested changes May 20, 2026

View reviewed changes


		from .base_gym import BaseGym

		_GYMNASIUM_INSTALL_HINT = "gymnasium is required for GymnasiumAdapter. Install it with: pip install gymnasium"

		_GYMNASIUM_INSTALL_HINT = "gymnasium is required for GymnasiumAdapter. Install it with: pip install gymnasium"


		def _import_gymnasium():

Conversation

rutayan-nv commented May 15, 2026

Uh oh!

coderabbitai Bot commented May 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

podkidyshev May 20, 2026

Choose a reason for hiding this comment

Uh oh!

podkidyshev May 20, 2026

Choose a reason for hiding this comment

Uh oh!

podkidyshev May 20, 2026

Choose a reason for hiding this comment

Uh oh!

podkidyshev May 20, 2026

Choose a reason for hiding this comment

Uh oh!

podkidyshev May 20, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai Bot commented May 15, 2026 •

edited

Loading