vLLM/SGLANG: add semantic degradation support by podkidyshev · Pull Request #890 · NVIDIA/cloudai

podkidyshev · 2026-05-13T15:05:20Z

Summary

Add semantic degradation support to vLLM and SGLANG

Test Plan

Automated CI
Manual runs

Additional Notes

coderabbitai · 2026-05-13T15:05:29Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@conf/experimental/vllm/test/vllm.toml`:
- Around line 21-24: The git_repos entry in vllm.toml currently pins commit =
"main", which is mutable; change the commit value to an immutable revision (a
specific full commit SHA or a tagged release) so the entry under [[git_repos]]
(url, commit, mount_as) uses a fixed commit SHA instead of "main"; update commit
= "<full-commit-sha-or-tag>" to ensure reproducible builds and CI.

In `@src/cloudai/workloads/common/llm_serving.py`:
- Around line 633-643: _gen_semantic_eval_block currently emits the semantic
eval command raw, skipping the project's custom_bash wrapper; change it to build
the semantic command via get_semantic_eval_command() and then wrap the full srun
invocation using the existing helper _with_custom_bash(...) so the semantic eval
runs in the same custom_bash environment as other steps. Concretely, keep using
self.semantic_eval_log_file and self.test_run.output_path, but replace the raw
f-string return with a call that passes the srun_prefix and the joined
semantic_cmd through self._with_custom_bash(...) (mirroring how serve/bench use
custom_bash) so the final returned string is the wrapped command.


📥 Commits

Reviewing files that changed from the base of the PR and between 1099207 and b32782a.

📒 Files selected for processing (22)

README.md
conf/experimental/sglang/test/sglang.toml
conf/experimental/sglang/test_scenario/sglang.toml
conf/experimental/vllm/test/vllm.toml
conf/experimental/vllm/test_scenario/vllm.toml
doc/workloads/sglang.rst
doc/workloads/vllm.rst
src/cloudai/workloads/common/llm_serving.py
src/cloudai/workloads/sglang/__init__.py
src/cloudai/workloads/sglang/report_generation_strategy.py
src/cloudai/workloads/sglang/sglang.py
src/cloudai/workloads/sglang/slurm_command_gen_strategy.py
src/cloudai/workloads/vllm/__init__.py
src/cloudai/workloads/vllm/report_generation_strategy.py
src/cloudai/workloads/vllm/slurm_command_gen_strategy.py
src/cloudai/workloads/vllm/vllm.py
tests/workloads/sglang/test_command_gen_strategy_slurm.py
tests/workloads/sglang/test_job_status_retrieval_strategy.py
tests/workloads/sglang/test_report_gen_strategy.py
tests/workloads/vllm/test_command_gen_strategy_slurm.py
tests/workloads/vllm/test_job_status_retrieval_strategy.py
tests/workloads/vllm/test_report_gen_strategy.py

coderabbitai · 2026-05-20T21:51:33Z

+[[git_repos]]
+url = "https://github.com/vllm-project/vllm.git"
+commit = "main"
+mount_as = "/vllm_repo"


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin git_repos.commit to an immutable revision.

Line 23 uses main, which makes semantic-eval behavior drift over time and weakens reproducibility/debuggability for benchmark results and CI.

Suggested change

 [[git_repos]]
 url = "https://github.com/vllm-project/vllm.git"
-commit = "main"
+commit = "<pinned-tag-or-sha>"
 mount_as = "/vllm_repo"
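The pinning concern can be checked mechanically. The sketch below is illustrative only and is not part of cloudai: `is_full_sha` and `find_unpinned_repos` are hypothetical helpers, the second repository entry is made up, and the check assumes SHA-1 object names (40 hex characters). A tag is also acceptable per the review comment, but a tag cannot be told apart from a branch by its name alone, so this check deliberately flags anything that is not a full SHA.

```python
import re

# A full Git SHA-1 object name is exactly 40 hexadecimal characters.
FULL_SHA_RE = re.compile(r"[0-9a-f]{40}")


def is_full_sha(commit: str) -> bool:
    """Return True if `commit` looks like a full 40-character SHA-1."""
    return FULL_SHA_RE.fullmatch(commit.lower()) is not None


def find_unpinned_repos(git_repos):
    """Return the URLs of entries whose `commit` is not a full SHA.

    `git_repos` is a list of dicts shaped like the [[git_repos]] tables
    in the TOML config (url, commit, mount_as).
    """
    return [
        repo["url"]
        for repo in git_repos
        if not is_full_sha(repo.get("commit", ""))
    ]


# Hypothetical data: the first entry mirrors the config under review,
# the second is an invented example of a pinned entry.
repos = [
    {
        "url": "https://github.com/vllm-project/vllm.git",
        "commit": "main",
        "mount_as": "/vllm_repo",
    },
    {
        "url": "https://example.com/pinned.git",
        "commit": "0123456789abcdef0123456789abcdef01234567",
        "mount_as": "/pinned",
    },
]

unpinned = find_unpinned_repos(repos)
print(unpinned)  # ['https://github.com/vllm-project/vllm.git']
```

A check like this could run in CI over the test configs, with an allowlist for tags if tagged releases are considered stable enough.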


coderabbitai · 2026-05-20T21:51:33Z

+    def _gen_semantic_eval_block(self, srun_prefix: str) -> str:
+        semantic_cmd = self.get_semantic_eval_command()
+        if not semantic_cmd:
+            return ""
+
+        return f"""\
+
+echo "Running semantic validation..."
+{srun_prefix} \\
+    --output={(self.test_run.output_path / self.semantic_eval_log_file).absolute()} \\
+    {" ".join(semantic_cmd)}"""


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Apply custom_bash wrapping to semantic eval commands.

Semantic evaluation is the only launched step that bypasses _with_custom_bash(...). If custom_bash is used to set up runtime env, serve/bench can pass while semantic-eval fails or runs in a different environment.

💡 Proposed fix

 def _gen_semantic_eval_block(self, srun_prefix: str) -> str:
     semantic_cmd = self.get_semantic_eval_command()
     if not semantic_cmd:
         return ""
 
+    semantic_tail = self._with_custom_bash(" ".join(semantic_cmd))
     return f"""\
 
 echo "Running semantic validation..."
 {srun_prefix} \\
     --output={(self.test_run.output_path / self.semantic_eval_log_file).absolute()} \\
-    {" ".join(semantic_cmd)}"""
+    {semantic_tail}"""
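To make the effect of the wrapping concrete, here is a standalone sketch. The real `_with_custom_bash` implementation is not shown in this PR page, so `with_custom_bash` below is a hypothetical stand-in that assumes the helper prepends the user's setup snippet and runs everything under `bash -c`. The function names, the example paths, and the `&&` joining are all assumptions for illustration.

```python
import shlex
from typing import List, Optional


def with_custom_bash(command: str, custom_bash: Optional[str]) -> str:
    """Hypothetical stand-in for cloudai's _with_custom_bash helper.

    Assumption: when a setup snippet is configured, the command runs
    after it inside a single `bash -c` invocation.
    """
    if not custom_bash:
        return command
    script = f"{custom_bash} && {command}"
    return f"bash -c {shlex.quote(script)}"


def build_semantic_eval_block(
    srun_prefix: str,
    output_file: str,
    semantic_cmd: List[str],
    custom_bash: Optional[str],
) -> str:
    """Build the script fragment that launches semantic evaluation."""
    if not semantic_cmd:
        # No semantic evaluation configured: emit nothing.
        return ""
    tail = with_custom_bash(" ".join(semantic_cmd), custom_bash)
    return (
        "\n"
        'echo "Running semantic validation..."\n'
        f"{srun_prefix} \\\n"
        f"    --output={output_file} \\\n"
        f"    {tail}"
    )


raw = build_semantic_eval_block(
    "srun", "/out/semantic_eval.log", ["python", "eval.py"], None
)
wrapped = build_semantic_eval_block(
    "srun", "/out/semantic_eval.log", ["python", "eval.py"], "source /opt/env.sh"
)
print(wrapped)
```

With the wrapper applied, the evaluation command sees the same environment setup as the serve and bench steps; without it, the command runs directly under `srun`.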



vllm/sglang: add semantic degradation support

5429767

podkidyshev self-assigned this May 13, 2026

podkidyshev added the feature label May 13, 2026

podkidyshev added 4 commits May 20, 2026 22:01

Merge branch 'main' into ipod/llm-semantic-degradation

9eae558

update readme

7f1b7a7

vllm/sglang configs in the repo

b0e2425

adjust configs

b32782a

podkidyshev marked this pull request as ready for review May 20, 2026 21:47

podkidyshev requested review from jeffnvidia and srivatsankrishnan as code owners May 20, 2026 21:47

coderabbitai Bot reviewed May 20, 2026

podkidyshev wants to merge 5 commits into main from ipod/llm-semantic-degradation
