
test(plugin-native,server): add unit tests for metrics, loader, auth handlers, and websocket dispatch#474

Open
staging-devin-ai-integration[bot] wants to merge 1 commit into main from devin/1779042393-cov-phase3-plugin-native-server

Conversation

@staging-devin-ai-integration

@staging-devin-ai-integration staging-devin-ai-integration Bot commented May 17, 2026

Summary

Phase 3 coverage initiative — Stream C. Test-only PR. No production code is touched.

Adds inline #[cfg(test)] mod tests modules to four previously-untested source files:

| File | Tests added | Highlights |
| --- | --- | --- |
| crates/plugin-native/src/metrics.rs | 9 | CallOutcome Copy/Debug, PluginMetrics::new/default construction, record() for every outcome incl. record_timeout(), build_labels() ordering + owned-string round-trip, extreme histogram inputs (f64::MIN_POSITIVE, 120 s) |
| crates/plugin-native/src/lib.rs | 9 | namespaced_kind() validation matrix (plain, idempotent prefix, rejected ::, reserved core::, empty), register_plugins() empty case, PluginMetadata Debug/Clone, LoadedNativePlugin::load() error paths for missing and non-library files via a tempfile-backed .so (no real artifact loaded) |
| apps/skit/src/auth/handlers.rs | 29 | require_admin() matrix, DTO serde shape, /me + /login + /logout + /tokens + revoke under both AuthMode::Disabled and AuthMode::Enabled, self-revocation prevention, reload-keys round-trip, moq-gated token endpoints |
| apps/skit/src/websocket.rs | 11 | max_ws_message_bytes() default, WebSocketMetrics::shared() singleton, invalid-JSON keeps connection open with typed error, oversized text/binary close paths, valid request round-trip with correlation_id, clean client close, broadcast event filtering by access_all_sessions in both directions |

Auth + WebSocket tests use lightweight in-memory fixtures (tempfile::TempDir, tokio::test(flavor = "multi_thread"), in-process Axum routers on 127.0.0.1:0 driven by real tokio-tungstenite clients) rather than mocks, per agent_docs/coverage.md guidance. The only new dependency is tempfile as a dev-dep on crates/plugin-native.

Local verification

  • cargo test -p streamkit-plugin-native → 60 passed
  • cargo test -p streamkit-server --lib --features moq → 376 passed
  • cargo test -p streamkit-server --lib --features mcp → 376 passed
  • cargo fmt --all -- --check → clean
  • cargo clippy -p streamkit-server --all-targets --features moq -- -D warnings → clean
  • cargo clippy -p streamkit-server --all-targets --features mcp -- -D warnings → clean
  • cargo clippy --workspace --exclude streamkit-server --all-targets -- -D warnings → clean
  • cargo deny check licenses → licenses ok

Bugs spotted while writing tests

None. No production behavior was modified or pinned-around.

Review & Testing Checklist for Human

Risk: green — tests-only, no production code touched.

  • Confirm CI passes on both moq and mcp feature matrices for streamkit-server.
  • Spot-check that the WebSocket integration-style tests (which bind a loopback TcpListener) don't flake on CI — they include short tokio::time::sleep yields after axum::serve spawn but no longer-than-necessary timeouts.
  • Sanity-check the crates/plugin-native/Cargo.toml dev-dep addition (tempfile = "3.27") matches the version used elsewhere in the workspace.

Notes

  • crates/plugin-native/src/wrapper.rs was intentionally not modified — it already carries #[cfg(test)] coverage and was explicitly excluded from Phase 3 Stream C.
  • The websocket tests deliberately exercise the gating predicate inside handle_websocket rather than the broadcaster filter in isolation, so a future refactor that moves the predicate out of the handler will still be covered.
  • Parent session: https://staging.itsdev.in/sessions/a088e863cbbd473b9be4a57529fc674b

Link to Devin session: https://staging.itsdev.in/sessions/e03778f4112e498fbfcc25d931718314
Requested by: @streamer45


Devin Review

Status Commit
🟢 Reviewed b64597a

…handlers, and websocket dispatch

Phase 3 coverage initiative — Stream C.

Adds inline #[cfg(test)] modules to four previously-untested source files
without touching any production code:

- crates/plugin-native/src/metrics.rs: 9 tests covering CallOutcome variants
  (Copy, Debug), PluginMetrics construction, record() for every outcome,
  record_timeout(), build_labels() ordering/cloning, and extreme histogram
  inputs (f64::MIN_POSITIVE, 120s).

- crates/plugin-native/src/lib.rs: 9 tests covering the namespaced_kind()
  validation matrix (plain name, idempotent prefix, rejected '::', reserved
  core:: prefix, empty input), register_plugins() with an empty input,
  PluginMetadata Debug/Clone, and LoadedNativePlugin::load() error paths
  for missing and non-library files via a tempfile-backed fixture. Avoids
  loading any real .so artifact through libloading.

- apps/skit/src/auth/handlers.rs: 29 tests covering require_admin(), DTO
  serde shape, /me + /login + /logout + /tokens + revoke endpoints under
  both AuthMode::Disabled and AuthMode::Enabled, self-revocation
  prevention, reload-keys round-trips, and moq-gated token endpoints.
  Uses tempfile::TempDir + tokio multi-thread fixtures instead of mocks.

- apps/skit/src/websocket.rs: 11 tests covering max_ws_message_bytes()
  default, WebSocketMetrics::shared() singleton, invalid-JSON keeps
  connection open + emits typed error, oversized text/binary close paths,
  valid request round-trip with correlation_id, clean client close, and
  broadcast event filtering by access_all_sessions in both directions.
  Spawns lightweight in-memory Axum routers on 127.0.0.1:0 and drives
  them via tokio-tungstenite.

crates/plugin-native gains a tempfile dev-dependency.

Parent session: https://staging.itsdev.in/sessions/a088e863cbbd473b9be4a57529fc674b

Signed-off-by: Staging-Devin AI <166158716+staging-devin-ai-integration[bot]@users.noreply.github.com>
Co-Authored-By: Claudio Costa <cstcld91@gmail.com>
@staging-devin-ai-integration
🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.



@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment


Devin Review found 8 potential issues.


Comment on lines +529 to +530
// Leak temp dir to keep state_dir paths valid for the whole test.
std::mem::forget(temp);

🟡 Test helper intentionally leaks every disabled-auth TempDir

make_state_auth_disabled calls std::mem::forget(temp), so every test using this helper leaves its temporary state directory on disk instead of letting TempDir clean it up. The comment notes the disabled auth state does not read from the directory, but the helper is used by several auth handler tests (apps/skit/src/auth/handlers.rs:637, apps/skit/src/auth/handlers.rs:654, apps/skit/src/auth/handlers.rs:675, etc.), so repeated local/CI runs accumulate leaked directories unnecessarily.

Prompt for agents
Fix the disabled-auth test helper in apps/skit/src/auth/handlers.rs so temporary directories are cleaned up instead of being leaked with std::mem::forget. Either avoid creating a TempDir for disabled auth if create_app_state does not need a live auth.state_dir, or return/hold the TempDir from the helper the same way make_state_auth_enabled does so it is dropped at the end of each test.
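
The difference between the two options can be sketched with the standard library alone. This is a minimal illustration, not the repository's helper: `ScratchDir` is a hypothetical stand-in for `tempfile::TempDir`, and `TestState`, `make_state_leaky`, and `make_state_guarded` are invented names. It shows that `std::mem::forget` skips `Drop` (so the directory stays on disk), while returning the guard lets the caller's scope clean it up.

```rust
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for `tempfile::TempDir`: creates a directory and
/// removes it again when the guard is dropped.
pub struct ScratchDir {
    path: PathBuf,
}

/// Per-process counter so every guard gets a distinct directory name.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

impl ScratchDir {
    pub fn new() -> std::io::Result<Self> {
        let id = NEXT_ID.fetch_add(1, Ordering::Relaxed);
        let path = std::env::temp_dir()
            .join(format!("scratch-{}-{}", std::process::id(), id));
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for ScratchDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored on purpose.
        let _ = fs::remove_dir_all(&self.path);
    }
}

/// Hypothetical test state; the real helper builds the server's app state.
pub struct TestState {
    pub state_dir: PathBuf,
}

/// Leaky variant: `forget` skips `Drop`, so the directory is never removed.
pub fn make_state_leaky() -> std::io::Result<TestState> {
    let dir = ScratchDir::new()?;
    let state = TestState { state_dir: dir.path().to_path_buf() };
    std::mem::forget(dir);
    Ok(state)
}

/// Guarded variant: the caller keeps the guard alive for the whole test,
/// and the directory is removed when the guard goes out of scope.
pub fn make_state_guarded() -> std::io::Result<(TestState, ScratchDir)> {
    let dir = ScratchDir::new()?;
    let state = TestState { state_dir: dir.path().to_path_buf() };
    Ok((state, dir))
}
```

A test using the guarded variant would bind the guard to a named variable such as `_guard` (not the bare `_` pattern, which drops it immediately) so the directory lives until the end of the test body.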

Comment on lines +647 to +660
let mut ws = connect(addr).await;
tokio::time::sleep(std::time::Duration::from_millis(50)).await;

let event = Message {
    message_type: MessageType::Event,
    correlation_id: None,
    payload: EventPayload::NodeStateChanged {
        session_id: "any-session".into(),
        node_id: "n1".into(),
        state: streamkit_core::NodeState::Running,
        timestamp: "1970-01-01T00:00:00Z".into(),
    },
};
let _ = state.event_tx.send(BroadcastEvent::to_all(event));

🟡 WebSocket broadcast tests race the handler subscription with fixed sleeps

The new broadcast tests rely on fixed tokio::time::sleep delays before publishing events, but the WebSocket upgrade only guarantees the client handshake completed; it does not guarantee handle_websocket has run far enough to call app_state.event_tx.subscribe() at apps/skit/src/websocket.rs:141. On slow CI, the admin test can publish before the handler subscribes and then time out waiting for an event; the filtered-viewer test can also pass vacuously because the handler missed the broadcast rather than because filtering worked.

Prompt for agents
Remove the timing dependency from the WebSocket broadcast tests in apps/skit/src/websocket.rs. Add deterministic synchronization that proves the server-side handler has subscribed before the test publishes to state.event_tx, such as an explicit readiness channel in the test-only route wrapper, or a request/response round trip through the same WebSocket after connection and before broadcasting. Then publish the event only after that readiness signal instead of relying on fixed sleeps.
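
The readiness-channel idea can be sketched without any async runtime. The following is an illustration under stated assumptions, not the project's code: `Bus` is a hypothetical standard-library stand-in for the broadcast channel behind `event_tx`, and `spawn_handler` and `publish_when_ready` are invented names. The handler signals readiness only after it has subscribed, and the publisher waits for that signal instead of sleeping.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Minimal fan-out bus standing in for a broadcast channel: only receivers
/// that subscribed before `send` was called see the event.
#[derive(Clone, Default)]
pub struct Bus {
    subscribers: Arc<Mutex<Vec<Sender<String>>>>,
}

impl Bus {
    pub fn subscribe(&self) -> Receiver<String> {
        let (tx, rx) = mpsc::channel();
        self.subscribers.lock().unwrap().push(tx);
        rx
    }

    /// Delivers the event and returns how many subscribers received it.
    pub fn send(&self, event: &str) -> usize {
        let subs = self.subscribers.lock().unwrap();
        subs.iter()
            .filter(|tx| tx.send(event.to_string()).is_ok())
            .count()
    }
}

/// Hypothetical handler: subscribes, signals readiness, then waits for one event.
pub fn spawn_handler(bus: Bus, ready: Sender<()>) -> thread::JoinHandle<Option<String>> {
    thread::spawn(move || {
        let events = bus.subscribe();
        // Signal only after the subscription exists.
        let _ = ready.send(());
        events.recv_timeout(Duration::from_secs(5)).ok()
    })
}

/// Publishes only after the handler has confirmed its subscription.
pub fn publish_when_ready(event: &str) -> Option<String> {
    let bus = Bus::default();
    let (ready_tx, ready_rx) = mpsc::channel();
    let handler = spawn_handler(bus.clone(), ready_tx);

    // Deterministic wait: no fixed sleep, only the readiness signal.
    ready_rx.recv_timeout(Duration::from_secs(5)).ok()?;
    assert_eq!(bus.send(event), 1, "handler must be subscribed before publishing");
    handler.join().ok().flatten()
}
```

Because `send` reports the number of receivers, a filtered-viewer test built this way can distinguish "the handler was subscribed and dropped the event" from "the handler never saw the broadcast", which is the vacuous-pass case described above.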


#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used, clippy::expect_used)]

🔴 Clippy suppression in auth handler tests lacks the required rationale

AGENTS.md explicitly requires every lint suppression to include a rationale comment, but this test module adds #![allow(clippy::unwrap_used, clippy::expect_used)] without explaining why the suppression is necessary. This violates the repository’s mandatory linting discipline for the newly added auth handler tests.

Suggested change
- #![allow(clippy::unwrap_used, clippy::expect_used)]
+ // Test setup intentionally uses unwrap/expect so failures point at the failed precondition.
+ #![allow(clippy::unwrap_used, clippy::expect_used)]


#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used, clippy::expect_used)]

🔴 Clippy suppression in WebSocket tests lacks the required rationale

AGENTS.md explicitly requires every lint suppression to include a rationale comment, but this test module adds #![allow(clippy::unwrap_used, clippy::expect_used)] without explaining why the suppression is necessary. This violates the repository’s mandatory linting discipline for the newly added WebSocket tests.

Suggested change
- #![allow(clippy::unwrap_used, clippy::expect_used)]
+ // Test setup intentionally uses unwrap/expect so failures point at the failed precondition.
+ #![allow(clippy::unwrap_used, clippy::expect_used)]


#[cfg(test)]
mod tests {
#![allow(clippy::unwrap_used, clippy::expect_used)]

🔴 Clippy suppression in native plugin tests lacks the required rationale

AGENTS.md explicitly requires every lint suppression to include a rationale comment, but this test module adds #![allow(clippy::unwrap_used, clippy::expect_used)] without explaining why the suppression is necessary. This violates the repository’s mandatory linting discipline for the newly added native plugin tests.

Suggested change
- #![allow(clippy::unwrap_used, clippy::expect_used)]
+ // Test setup intentionally uses unwrap/expect so failures point at the failed precondition.
+ #![allow(clippy::unwrap_used, clippy::expect_used)]

router.layer(DefaultBodyLimit::max(AUTH_MAX_BODY_BYTES))
}

#[cfg(test)]

📝 Info: Most of the PR is test-only, so runtime behavior risk is limited

The diff adds tests and one dev-dependency, while the surrounding auth handlers, WebSocket handler, native plugin loader, and plugin metrics implementations are unchanged. That means the runtime bug surface is mostly indirect: flaky tests, resource cleanup in tests, and whether the tests encode inaccurate assumptions about existing behavior rather than new production regressions.


Comment on lines +136 to +154
#[test]
fn record_handles_every_outcome() {
    let metrics = PluginMetrics::new();
    let labels = PluginMetrics::build_labels("test_kind", "process_packet");

    // Each outcome must traverse the corresponding match arm without panicking.
    // Against the default no-op meter provider these are observable only via
    // "did not panic"; under a real SdkMeterProvider they would also bump the
    // backing counters.
    metrics.record(&labels, 0.0, CallOutcome::Success);
    metrics.record(&labels, 0.025, CallOutcome::Error);
    metrics.record(&labels, 1.5, CallOutcome::Panic);
}

#[test]
fn record_timeout_does_not_panic() {
    let metrics = PluginMetrics::new();
    let labels = PluginMetrics::build_labels("any", "flush");
    metrics.record_timeout(&labels);

📝 Info: Metrics tests are smoke tests rather than counter assertions

The new plugin metrics tests construct instruments and call record/record_timeout, but they do not install an SDK meter reader or assert exported counter/histogram values. This is not a bug because the default global provider is effectively enough for “does not panic” coverage, but reviewers should treat these tests as API/smoke coverage rather than verification that each metric series is emitted with the expected value.
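
To make the distinction concrete, here is a minimal sketch of what a counter assertion looks like when the recorded values are observable. It does not use the OpenTelemetry SDK or the real `PluginMetrics`; `Outcome` and `CountingMetrics` are hypothetical stand-ins backed by atomics, assumed purely for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical outcome enum mirroring the three outcomes exercised above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Outcome {
    Success,
    Error,
    Panic,
}

/// Hypothetical recorder whose counters can be read back in a test.
#[derive(Default)]
pub struct CountingMetrics {
    success: AtomicU64,
    error: AtomicU64,
    panic: AtomicU64,
}

impl CountingMetrics {
    /// Selects the counter that backs a given outcome.
    fn counter(&self, outcome: Outcome) -> &AtomicU64 {
        match outcome {
            Outcome::Success => &self.success,
            Outcome::Error => &self.error,
            Outcome::Panic => &self.panic,
        }
    }

    /// Increments the counter for `outcome` by one.
    pub fn record(&self, outcome: Outcome) {
        self.counter(outcome).fetch_add(1, Ordering::Relaxed);
    }

    /// Reads the current value, which is what a counter assertion checks.
    pub fn count(&self, outcome: Outcome) -> u64 {
        self.counter(outcome).load(Ordering::Relaxed)
    }
}
```

A smoke test only calls `record` and passes if nothing panics; a counter assertion additionally checks, for example, that two `Success` calls leave `count(Outcome::Success)` at 2 and the other counters at 0.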


Comment on lines +534 to +560
/// Confirms that `load()` surfaces a wrapped libloading error (not a panic
/// or silent fallback) when given a path that cannot be opened. Keeps the
/// expected error-message shape stable without requiring a real .so.
#[test]
fn load_returns_error_for_missing_library() {
    let result = LoadedNativePlugin::load("/this/path/definitely/does/not/exist.so");
    let Err(err) = result else { panic!("expected error for missing path") };
    let msg = err.to_string();
    assert!(msg.starts_with("Failed to load library"), "wrapped libloading error: {msg}");
    assert!(msg.contains("/this/path/definitely/does/not/exist.so"));
}

/// load() must error on a path that points at a non-library file rather than
/// proceeding into symbol lookup with garbage memory. Uses a tempfile so the
/// failure source is "not a dynamic library" rather than "file not found".
#[test]
fn load_returns_error_for_non_library_file() {
    let mut tmp = tempfile::Builder::new().suffix(".so").tempfile().expect("create tempfile");
    std::io::Write::write_all(&mut tmp, b"not a real shared object").expect("write tempfile");
    let path = tmp.path().to_path_buf();

    let Err(err) = LoadedNativePlugin::load(&path) else {
        panic!("expected error for non-library file");
    };
    let msg = err.to_string();
    assert!(msg.starts_with("Failed to load library"), "wrapped libloading error: {msg}");
}

📝 Info: Native plugin load tests intentionally validate only early loader failures

The missing-library and non-library-file tests exercise the Library::new failure path and confirm the host wraps the libloading error. They do not cover symbol lookup, API version checks, metadata extraction, or instance creation because those would require a real test dynamic library matching the native plugin ABI. That narrower scope is reasonable for these additions, but it is worth noting the loader’s more security-sensitive ABI paths remain covered elsewhere or not by this PR.


@codecov

codecov Bot commented May 17, 2026

Codecov Report

❌ Patch coverage is 98.91304% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.80%. Comparing base (fad6274) to head (b64597a).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| apps/skit/src/websocket.rs | 96.32% | 9 Missing ⚠️ |
| crates/plugin-native/src/lib.rs | 98.63% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #474      +/-   ##
==========================================
+ Coverage   65.91%   66.80%   +0.89%     
==========================================
  Files         217      217              
  Lines       57530    58450     +920     
  Branches     1597     1597              
==========================================
+ Hits        37922    39049    +1127     
+ Misses      19602    19395     -207     
  Partials        6        6              

| Flag | Coverage Δ |
| --- | --- |
| backend | 65.98% <98.91%> (+0.99%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Components | Coverage Δ |
| --- | --- |
| core | 84.29% <ø> (ø) |
| engine | 75.71% <ø> (ø) |
| api | 84.73% <ø> (ø) |
| nodes | 67.41% <ø> (ø) |
| server | 59.83% <98.86%> (+2.66%) ⬆️ |
| plugin-native | 72.67% <99.20%> (+1.73%) ⬆️ |
| plugin-wasm | 6.37% <ø> (ø) |
| ui-services | 74.73% <ø> (ø) |
| ui-components | 60.49% <ø> (ø) |

| Files with missing lines | Coverage Δ |
| --- | --- |
| apps/skit/src/auth/handlers.rs | 92.47% <100.00%> (+51.70%) ⬆️ |
| crates/plugin-native/src/metrics.rs | 100.00% <100.00%> (+11.76%) ⬆️ |
| crates/plugin-native/src/lib.rs | 82.24% <98.63%> (+8.35%) ⬆️ |
| apps/skit/src/websocket.rs | 87.17% <96.32%> (+28.88%) ⬆️ |

... and 3 files with indirect coverage changes
