vLLM/SGLANG: add semantic degradation support#890
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Enterprise Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@conf/experimental/vllm/test/vllm.toml`:
- Around line 21-24: The git_repos entry in vllm.toml currently pins commit =
"main", which is mutable; change the commit value to an immutable revision (a
specific full commit SHA or a tagged release) so the entry under [[git_repos]]
(url, commit, mount_as) uses a fixed commit SHA instead of "main"; update commit
= "<full-commit-sha-or-tag>" to ensure reproducible builds and CI.
In `@src/cloudai/workloads/common/llm_serving.py`:
- Around line 633-643: _gen_semantic_eval_block currently emits the semantic
eval command raw, skipping the project's custom_bash wrapper; change it to build
the semantic command via get_semantic_eval_command() and then wrap the full srun
invocation using the existing helper _with_custom_bash(...) so the semantic eval
runs in the same custom_bash environment as other steps. Concretely, keep using
self.semantic_eval_log_file and self.test_run.output_path, but replace the raw
f-string return with a call that passes the srun_prefix and the joined
semantic_cmd through self._with_custom_bash(...) (mirroring how serve/bench use
custom_bash) so the final returned string is the wrapped command.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 9ca78aef-fa48-4a85-ae69-192d360d60a3
📒 Files selected for processing (22)
README.mdconf/experimental/sglang/test/sglang.tomlconf/experimental/sglang/test_scenario/sglang.tomlconf/experimental/vllm/test/vllm.tomlconf/experimental/vllm/test_scenario/vllm.tomldoc/workloads/sglang.rstdoc/workloads/vllm.rstsrc/cloudai/workloads/common/llm_serving.pysrc/cloudai/workloads/sglang/__init__.pysrc/cloudai/workloads/sglang/report_generation_strategy.pysrc/cloudai/workloads/sglang/sglang.pysrc/cloudai/workloads/sglang/slurm_command_gen_strategy.pysrc/cloudai/workloads/vllm/__init__.pysrc/cloudai/workloads/vllm/report_generation_strategy.pysrc/cloudai/workloads/vllm/slurm_command_gen_strategy.pysrc/cloudai/workloads/vllm/vllm.pytests/workloads/sglang/test_command_gen_strategy_slurm.pytests/workloads/sglang/test_job_status_retrieval_strategy.pytests/workloads/sglang/test_report_gen_strategy.pytests/workloads/vllm/test_command_gen_strategy_slurm.pytests/workloads/vllm/test_job_status_retrieval_strategy.pytests/workloads/vllm/test_report_gen_strategy.py
| [[git_repos]] | ||
| url = "https://github.com/vllm-project/vllm.git" | ||
| commit = "main" | ||
| mount_as = "/vllm_repo" |
There was a problem hiding this comment.
Pin git_repos.commit to an immutable revision.
Line 23 uses main, which makes semantic-eval behavior drift over time and weakens reproducibility/debuggability for benchmark results and CI.
Suggested change
[[git_repos]]
url = "https://github.com/vllm-project/vllm.git"
-commit = "main"
+commit = "<pinned-tag-or-sha>"
mount_as = "/vllm_repo"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@conf/experimental/vllm/test/vllm.toml` around lines 21 - 24, The git_repos
entry in vllm.toml currently pins commit = "main", which is mutable; change the
commit value to an immutable revision (a specific full commit SHA or a tagged
release) so the entry under [[git_repos]] (url, commit, mount_as) uses a fixed
commit SHA instead of "main"; update commit = "<full-commit-sha-or-tag>" to
ensure reproducible builds and CI.
| def _gen_semantic_eval_block(self, srun_prefix: str) -> str: | ||
| semantic_cmd = self.get_semantic_eval_command() | ||
| if not semantic_cmd: | ||
| return "" | ||
|
|
||
| return f"""\ | ||
|
|
||
| echo "Running semantic validation..." | ||
| {srun_prefix} \\ | ||
| --output={(self.test_run.output_path / self.semantic_eval_log_file).absolute()} \\ | ||
| {" ".join(semantic_cmd)}""" |
There was a problem hiding this comment.
Apply custom_bash wrapping to semantic eval commands.
Semantic evaluation is the only launched step that bypasses _with_custom_bash(...). If custom_bash is used to set up runtime env, serve/bench can pass while semantic-eval fails or runs in a different environment.
💡 Proposed fix
def _gen_semantic_eval_block(self, srun_prefix: str) -> str:
semantic_cmd = self.get_semantic_eval_command()
if not semantic_cmd:
return ""
+ semantic_tail = self._with_custom_bash(" ".join(semantic_cmd))
return f"""\
echo "Running semantic validation..."
{srun_prefix} \\
--output={(self.test_run.output_path / self.semantic_eval_log_file).absolute()} \\
- {" ".join(semantic_cmd)}"""
+ {semantic_tail}"""📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def _gen_semantic_eval_block(self, srun_prefix: str) -> str: | |
| semantic_cmd = self.get_semantic_eval_command() | |
| if not semantic_cmd: | |
| return "" | |
| return f"""\ | |
| echo "Running semantic validation..." | |
| {srun_prefix} \\ | |
| --output={(self.test_run.output_path / self.semantic_eval_log_file).absolute()} \\ | |
| {" ".join(semantic_cmd)}""" | |
| def _gen_semantic_eval_block(self, srun_prefix: str) -> str: | |
| semantic_cmd = self.get_semantic_eval_command() | |
| if not semantic_cmd: | |
| return "" | |
| semantic_tail = self._with_custom_bash(" ".join(semantic_cmd)) | |
| return f"""\ | |
| echo "Running semantic validation..." | |
| {srun_prefix} \\ | |
| --output={(self.test_run.output_path / self.semantic_eval_log_file).absolute()} \\ | |
| {semantic_tail}""" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/cloudai/workloads/common/llm_serving.py` around lines 633 - 643,
_gen_semantic_eval_block currently emits the semantic eval command raw, skipping
the project's custom_bash wrapper; change it to build the semantic command via
get_semantic_eval_command() and then wrap the full srun invocation using the
existing helper _with_custom_bash(...) so the semantic eval runs in the same
custom_bash environment as other steps. Concretely, keep using
self.semantic_eval_log_file and self.test_run.output_path, but replace the raw
f-string return with a call that passes the srun_prefix and the joined
semantic_cmd through self._with_custom_bash(...) (mirroring how serve/bench use
custom_bash) so the final returned string is the wrapped command.
Summary
Test Plan
Additional Notes