fix(otlp): propagate _dd.p.tid from chunk root to all spans #2014
zacharycmontoya wants to merge 1 commit into
DD tracers set `_dd.p.tid` (high 64 bits of 128-bit trace ID) only on the chunk root per RFC #85; the Datadog backend reconstructs the full 128-bit ID at ingest. The OTLP mapper previously read the tag per span, so children landed with upper 64 bits zeroed and traces fragmented in pure-OTel backends.

Resolve the chunk-level `_dd.p.tid` once in `map_traces_to_otlp` and apply it to every span. Per-span value still wins (forward-compat with tracers that propagate everywhere). Use `find_map` over the chunk so a malformed root tag falls back to the first parseable value in the chunk rather than poisoning the whole trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
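The resolution described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the `Span` struct, `span_tid`, `chunk_tid`, `full_trace_id`, and the `span` constructor are hypothetical stand-ins, and the real mapper works on the library's own span type.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified span: only the fields this sketch needs.
struct Span {
    trace_id: u64,                 // low 64 bits of the trace ID
    meta: HashMap<String, String>, // string tags, may contain "_dd.p.tid"
}

/// Parse a span's own `_dd.p.tid` tag (hex), if present and well-formed.
fn span_tid(span: &Span) -> Option<u64> {
    span.meta
        .get("_dd.p.tid")
        .and_then(|v| u64::from_str_radix(v, 16).ok())
}

/// First parseable `_dd.p.tid` in the chunk; 0 for legacy 64-bit traces.
fn chunk_tid(chunk: &[Span]) -> u64 {
    chunk.iter().find_map(span_tid).unwrap_or(0)
}

/// Full 128-bit trace ID: the per-span tag wins, else the chunk-level value.
fn full_trace_id(span: &Span, chunk_high: u64) -> u128 {
    let high = span_tid(span).unwrap_or(chunk_high);
    ((high as u128) << 64) | span.trace_id as u128
}

/// Hypothetical helper that builds a span for the example.
fn span(trace_id: u64, tid: Option<&str>) -> Span {
    let mut meta = HashMap::new();
    if let Some(t) = tid {
        meta.insert("_dd.p.tid".to_string(), t.to_string());
    }
    Span { trace_id, meta }
}
```

With a chunk whose root carries the tag and whose child does not, both spans resolve to the same 128-bit ID, and a malformed tag on one span is skipped in favor of the next parseable one.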
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #2014 +/- ##
==========================================
+ Coverage 72.65% 72.78% +0.12%
==========================================
Files 453 453
Lines 74934 75131 +197
==========================================
+ Hits 54441 54681 +240
+ Misses 20493 20450 -43
🎉 All green! 🧪 All tests passed. 🔗 Commit SHA: 5e0dc6f
// Per RFC #85 the high 64 bits of a 128-bit trace ID live on the chunk root as the
// `_dd.p.tid` meta tag. Scan the chunk for the first parseable value so a malformed
// root tag doesn't poison the rest of the chunk; absence (legacy 64-bit traces) → 0.
let chunk_trace_id_high: u64 = chunk
    .iter()
    .find_map(|s| {
        s.meta
            .get("_dd.p.tid")
            .and_then(|v| u64::from_str_radix(v.borrow(), 16).ok())
    })
    .unwrap_or(0);
The Span type does represent trace IDs as 128-bit ints (the highest 64 bits are discarded at serialization to v04). This is useful, for instance, in Python, where (once we release native spans) the trace ID field will hold the full trace ID.
I'd prefer that you first check whether those high bits are non-zero before scanning through meta, for efficiency.
Ah okay. Are you suggesting I check only the first span's trace ID field to see if it's 128-bit, or add a trace ID lookup in the iteration (for each span) before checking the meta dictionary?
Also, once I have this value, do you recommend passing the full 128-bit trace ID to `map_span`, or passing only the upper 64 bits like I have right now?
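For reference, one possible reading of the suggestion (the per-span variant asked about above) might look like the sketch below. It assumes a hypothetical `Span` with a `u128` `trace_id` field and a string `meta` map; the function name `chunk_trace_id_high` mirrors the variable in the diff but is otherwise illustrative, and the real span type may differ.

```rust
use std::collections::HashMap;

/// Hypothetical span whose `trace_id` may already carry all 128 bits.
struct Span {
    trace_id: u128,
    meta: HashMap<String, String>,
}

/// Resolve the high 64 bits for a chunk: prefer the numeric field, and only
/// parse `_dd.p.tid` from `meta` when no span carries non-zero high bits.
fn chunk_trace_id_high(chunk: &[Span]) -> u64 {
    // Cheap check first: integer shift, no map lookup or string parsing.
    let from_field = chunk
        .iter()
        .map(|s| (s.trace_id >> 64) as u64)
        .find(|&high| high != 0);

    from_field
        .or_else(|| {
            // Fallback: first parseable hex tag anywhere in the chunk.
            chunk.iter().find_map(|s| {
                s.meta
                    .get("_dd.p.tid")
                    .and_then(|v| u64::from_str_radix(v, 16).ok())
            })
        })
        .unwrap_or(0)
}
```

The meta scan only runs for chunks where every span has zero high bits in the numeric field, which is the efficiency gain the review comment is after.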
 let trace_id_high: u128 = span
     .meta
     .get("_dd.p.tid")
     .and_then(|v| u64::from_str_radix(v.borrow(), 16).ok())
-    .unwrap_or(0) as u128;
+    .unwrap_or(chunk_trace_id_high) as u128;
Is this still necessary, given that the value is now passed as an argument and all spans in a chunk should belong to the same trace?
Hmm. I think you're right, we don't need this lookup anymore. Removing this would only break scenarios where spans in the same chunk had different values for the _dd.p.tid span tag, which seems like an invalid scenario. I can remove this.
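With the per-span lookup removed, the composition reduces to combining the chunk-level argument with each span's low 64 bits. A minimal sketch, with hypothetical function names and an assumed big-endian byte layout for the 16-byte ID (neither is taken from the PR):

```rust
/// Combine the chunk-level high bits with a span's low 64 bits.
/// No per-span `_dd.p.tid` lookup: the chunk value is trusted for every span.
fn compose_trace_id(trace_id_low: u64, chunk_trace_id_high: u64) -> u128 {
    ((chunk_trace_id_high as u128) << 64) | trace_id_low as u128
}

/// 16-byte form of the ID; big-endian order is an assumption of this sketch.
fn trace_id_bytes(trace_id: u128) -> [u8; 16] {
    trace_id.to_be_bytes()
}
```

Passing only the upper 64 bits keeps the helper's signature small; passing the full 128-bit value instead would move the shift-and-or into the caller.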
What does this PR do?
This PR updates the conversion of DD spans to OTLP spans to ensure all spans receive the high 64 bits of the 128-bit trace ID.
DD tracers set `_dd.p.tid` (high 64 bits of 128-bit trace ID) only on the chunk root per RFC #85; the Datadog backend reconstructs the full 128-bit ID at ingest. The OTLP mapper previously only read the tag per span, so child spans were emitted with upper 64 bits zeroed and traces fragmented in pure-OTel backends.

The approach determines the chunk-level `_dd.p.tid` once in `map_traces_to_otlp` and applies it to every span in the chunk. Per-span value still wins (forward-compat with tracers that propagate everywhere). Use `find_map` over the chunk so a malformed root tag falls back to the first parseable value in the chunk rather than poisoning the whole trace.

Motivation
End-to-end tests with dd-trace-py exporting OTLP spans surfaced this issue, in which child spans record different trace IDs than their local root spans. Notably, the only difference was that the high 64 bits of the 128-bit trace ID were zeroed out.
Additional Notes
A system test covering this scenario is being added simultaneously in DataDog/system-tests#6973.
How to test the change?
Regression tests have been added to `libdd-trace-utils/src/otlp_encoder/mapper.rs`.