[ffe] Activate test_flag_eval_metrics.py across released tracer versions#6972
Open
sameerank wants to merge 5 commits into
Ruby v2.32.0, Python v4.7.0, Go v2.8.0, Node.js v5.99.0 (express4 only), .NET v3.44.0, and Java v1.62.0 now ship FFE evaluation metrics. Java keeps the per-class FFL-1972 overrides so CI XPASSes will flag the tests that can be trimmed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor
🎉 All green! 🧪 All tests passed 🔗 Commit SHA: dce01cb
dd-trace-js v5.103.0 fails to distinguish feature_flag.result.reason for static/split/type-mismatch paths (all report targeting_match) and emits error.type=general instead of parse_error. Tracked in FFL-2313 under FFL-1899. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Java v1.62.0 actually shipped fixes for 16 of the 17 FFL-1972 tests (verified by CI XPASS on the prod spring-boot job). Only Test_FFE_Eval_Metric_Count still fails; the rest can run without overrides. Test_FFE_Eval_Targeting_Key_Optional for nodejs is more accurately a bug (FFL-1730) than irrelevant — the JS SDK errors on empty targeting key instead of treating it as optional. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9579087 to
dc4ef75
Compare
sameerank
commented
May 20, 2026
Comment on lines -3168 to -3181
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Different_Flags::test_ffe_eval_metric_different_flags: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Numeric_To_Integer::test_ffe_eval_metric_numeric_to_integer: bug (FFL-1972)
? tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Parse_Error_Invalid_Regex::test_ffe_eval_metric_parse_error_invalid_regex
: bug (FFL-1972)
? tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Parse_Error_Variant_Type_Mismatch::test_ffe_eval_metric_parse_error_variant_type_mismatch
: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Type_Mismatch::test_ffe_eval_metric_type_mismatch: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Nested_Attributes_Ignored::test_ffe_eval_nested_attributes_ignored: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_No_Config_Loaded::test_ffe_eval_no_config_loaded: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Reason_Default::test_ffe_eval_reason_default: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Reason_Disabled::test_ffe_eval_reason_disabled: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Reason_Split::test_ffe_eval_reason_split: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Reason_Targeting::test_ffe_eval_reason_targeting: bug (FFL-1972)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Targeting_Key_Optional::test_ffe_eval_targeting_key_optional: bug (FFL-1972)
Contributor
Author
They're passing the tests so I believe they are fixed!
Comment on lines +1674 to +1681
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Basic::test_ffe_eval_metric_basic: bug (FFL-2313)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Numeric_To_Integer::test_ffe_eval_metric_numeric_to_integer: bug (FFL-2313)
? tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Parse_Error_Invalid_Regex::test_ffe_eval_metric_parse_error_invalid_regex
: bug (FFL-2313)
? tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Metric_Parse_Error_Variant_Type_Mismatch::test_ffe_eval_metric_parse_error_variant_type_mismatch
: bug (FFL-2313)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Reason_Split::test_ffe_eval_reason_split: bug (FFL-2313)
tests/ffe/test_flag_eval_metrics.py::Test_FFE_Eval_Targeting_Key_Optional: bug (FFL-1730)
Contributor
Author
I think it's okay to skip these tests because they actually cover a different feature, "Standardized evaluation reasons": https://feature-parity.us1.prod.dog/?runDateFilter=7d&products=14&feature=552&language=2
The previous catch-all entry ran the tests on every Java weblog, which caused 404s on akka-http, jersey-grizzly2, play, ratpack, resteasy-netty3, spring-boot-3-native, vertx3, and vertx4 — none of which implement the /ffe endpoint. Matches the existing weblog_declaration pattern used by test_dynamic_evaluation.py and test_exposures.py.
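The scoping described in that commit can be sketched as follows. This is a hypothetical illustration, not the system-tests implementation: it assumes a `weblog_declaration` behaves like a mapping in which an entry for the exact weblog name wins and `"*"` is the fallback for every other weblog. The function name `resolve_declaration` is an assumption; the Java values are the ones listed in this PR.

```python
def resolve_declaration(declaration: dict, weblog: str) -> str:
    """Return the entry for this weblog, falling back to the "*" wildcard.

    Hypothetical helper: the lookup rule is an assumption about how a
    weblog_declaration is interpreted, not code taken from system-tests.
    """
    if weblog in declaration:
        return declaration[weblog]
    return declaration["*"]


# Java values from this PR: only spring-boot implements the /ffe endpoint.
java_declaration = {"*": "irrelevant", "spring-boot": "v1.62.0"}

for weblog in ("spring-boot", "vertx4", "play"):
    print(weblog, "->", resolve_declaration(java_declaration, weblog))
```

Under that assumption, weblogs without an `/ffe` endpoint resolve to `irrelevant` and are never exercised, which is what avoids the 404s.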
Member
FFE tests were disabled yesterday due to an incident. Updating the branch to re-trigger the tests to make sure they are still passing.
Motivation
`tests/ffe/test_flag_eval_metrics.py` is now supported in released versions of every tracer that implements FFE evaluation metrics. The manifests were still pinning these tests to `-dev` versions or `missing_feature`, so the tests don't run against the released artifacts even though the feature is shipped.

Release tags:
Getting the tests to pass for all of these releases lets us update https://feature-parity.us1.prod.dog/?runDateFilter=7d&products=14&feature=548
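To make the gating problem concrete, here is a minimal sketch of why a `-dev` pin keeps a released tracer from running the tests. It assumes semver-style ordering, in which `4.8.0-dev` sorts before `4.8.0` but after `4.7.0`. The helpers `parse_version` and `is_activated` are hypothetical names for illustration and are not the system-tests implementation.

```python
import re


def parse_version(text: str) -> tuple:
    """Parse 'vX.Y.Z' or 'vX.Y.Z-dev' into a sortable tuple.

    The last element is 0 for a -dev pre-release and 1 for a release, so
    X.Y.Z-dev sorts before X.Y.Z (assumed semver-style ordering).
    """
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)(-dev)?", text)
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    major, minor, patch = (int(match.group(i)) for i in (1, 2, 3))
    return (major, minor, patch, 0 if match.group(4) else 1)


def is_activated(tracer_version: str, manifest_version: str) -> bool:
    """Hypothetical gate: the test runs only if tracer >= manifest version."""
    return parse_version(tracer_version) >= parse_version(manifest_version)


# Python manifest entry before and after this PR, for the released tracer.
print(is_activated("v4.7.0", "v4.8.0-dev"))  # False
print(is_activated("v4.7.0", "v4.7.0"))      # True
```

With the old `v4.8.0-dev` pin the released v4.7.0 tracer falls below the gate; lowering the pin to `v4.7.0` activates the tests for it.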
Changes
Version-gate activation
| Manifest | Before | After |
| --- | --- | --- |
| `manifests/ruby.yml` | `rails72: v2.32.0-dev` | `rails72: v2.32.0` |
| `manifests/python.yml` | `v4.8.0-dev` | `v4.7.0` |
| `manifests/golang.yml` | `v2.9.0-dev` | `v2.8.0` |
| `manifests/nodejs.yml` | `missing_feature` | `weblog_declaration` with `"*": incomplete_test_app`, `express4: *ref_5_99_0` |
| `manifests/dotnet.yml` | `missing_feature (FFL-2257 ...)` | `v3.44.0` |
| `manifests/java.yml` | `missing_feature (FFL-1972)` | `weblog_declaration` with `"*": irrelevant`, `spring-boot: v1.62.0` |

Only `express4` is activated on Node.js because the `/ffe` endpoint is only implemented in the `express` test app; other weblogs remain `incomplete_test_app`, matching the existing pattern used by `test_dynamic_evaluation.py` and `test_exposures.py`.

Similarly, Java is scoped to the `spring-boot` weblog only: the non-spring-boot Java weblogs (akka-http, jersey-grizzly2, play, ratpack, resteasy-netty3, spring-boot-3-native, vertx3, vertx4) don't implement `/ffe`, matching the pattern used by `test_dynamic_evaluation.py` and `test_exposures.py`.

Java per-test overrides
Java v1.62.0 ships fixes for 16 of the 17 prior FFL-1972 failures, verified by XPASS on the Java prod spring-boot CI job (`logs_endtoend_java_spring-boot_prod_1` artifact, FFE scenario). Only `Test_FFE_Eval_Metric_Count` still fails:

Node.js per-test overrides
dd-trace-js v5.103.0 has consistency issues in `feature_flag.result.reason` / `error.type` tag derivation on the `feature_flag.evaluations` OTel metric. Tracked in FFL-2313 (subtask under FFL-1899):

`Test_FFE_Eval_Targeting_Key_Optional` was re-tagged from `irrelevant (JS SDK requires targeting key)` to `bug (FFL-1730)`: the JS SDK erroring on an empty targeting key is a known bug, not a "test doesn't apply" case, and FFL-1730 already covers the root cause (also referenced by `test_exposures.py::Test_FFE_EXP_5_Missing_Targeting_Key` above it in the same file).

Workflow
🚀 Once your PR is reviewed and the CI green, you can merge it!
🛟 #apm-shared-testing 🛟
Reviewer checklist