fix(otlp): propagate _dd.p.tid from chunk root to all spans#2014

Open
zacharycmontoya wants to merge 1 commit into main from zach.montoya/otlp-propagate-dd-p-tid

@zacharycmontoya zacharycmontoya commented May 19, 2026

What does this PR do?

This PR updates the conversion of DD spans to OTLP spans to ensure all spans receive the high 64 bits of the 128-bit trace ID.

DD tracers set _dd.p.tid (high 64 bits of 128-bit trace ID) only on the chunk root per RFC #85 — the Datadog backend reconstructs the full 128-bit ID at ingest. The OTLP mapper previously only read the tag per span, so child spans were emitted with upper 64 bits zeroed and traces fragmented in pure-OTel backends.
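
To make the failure mode concrete, here is a minimal sketch of how the two halves combine into one OTLP trace ID. The helper name `otlp_trace_id` and the big-endian byte layout are assumptions for illustration, not the mapper's actual code.

```rust
/// Hypothetical helper: combine the hex-encoded `_dd.p.tid` value (high 64 bits)
/// with the 64-bit Datadog trace ID (low 64 bits) into 16 trace ID bytes.
/// A missing or unparseable tag leaves the high 64 bits at zero.
fn otlp_trace_id(tid_hex: Option<&str>, trace_id_low: u64) -> [u8; 16] {
    let high = tid_hex
        .and_then(|v| u64::from_str_radix(v, 16).ok())
        .unwrap_or(0);
    let full = ((high as u128) << 64) | trace_id_low as u128;
    // Assumption: the 128-bit value is written out big-endian.
    full.to_be_bytes()
}
```

With this helper, a chunk root carrying the tag and a child without it share the low 8 bytes but differ in the high 8 bytes, so a pure-OTel backend sees two different trace IDs.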

The approach determines the chunk-level _dd.p.tid once in map_traces_to_otlp and applies it to every span in the chunk. Per-span value still wins (forward-compat with tracers that propagate everywhere). Use find_map over the chunk so a malformed root tag falls back to the first parseable value in the chunk rather than poisoning the whole trace.
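
A minimal sketch of that resolution order, assuming a simplified `Span` with a plain `HashMap` for `meta`. The real span type in libdd-trace-utils differs, and `span`, `parse_tid`, and `full_trace_id` are hypothetical names used only here.

```rust
use std::collections::HashMap;

// Hypothetical, simplified span; not the real libdd-trace-utils type.
struct Span {
    trace_id: u64,
    meta: HashMap<String, String>,
}

// Hypothetical constructor to keep the examples short.
fn span(trace_id: u64, tid: Option<&str>) -> Span {
    let mut meta = HashMap::new();
    if let Some(v) = tid {
        meta.insert("_dd.p.tid".to_string(), v.to_string());
    }
    Span { trace_id, meta }
}

/// Parse the span's own `_dd.p.tid` tag, if present and valid hex.
fn parse_tid(span: &Span) -> Option<u64> {
    span.meta
        .get("_dd.p.tid")
        .and_then(|v| u64::from_str_radix(v, 16).ok())
}

/// First parseable `_dd.p.tid` in the chunk; 0 for legacy 64-bit traces.
fn chunk_trace_id_high(chunk: &[Span]) -> u64 {
    chunk.iter().find_map(parse_tid).unwrap_or(0)
}

/// Per-span tag wins; otherwise fall back to the chunk-level value.
fn full_trace_id(span: &Span, chunk_high: u64) -> u128 {
    let high = parse_tid(span).unwrap_or(chunk_high);
    ((high as u128) << 64) | span.trace_id as u128
}
```

Because `find_map` skips values that fail to parse, a chunk whose root carries a malformed tag still resolves to the first valid value found on a later span.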

Motivation

End-to-end tests with dd-trace-py exporting OTLP spans surfaced this issue: child spans recorded different trace IDs than their local root spans. The only difference was that the high 64 bits of the 128-bit trace ID were zeroed out.

Additional Notes

A system test covering this scenario is being added in parallel in DataDog/system-tests#6973.

How to test the change?

Regression tests have been added to libdd-trace-utils/src/otlp_encoder/mapper.rs.

DD tracers set `_dd.p.tid` (high 64 bits of 128-bit trace ID) only on
the chunk root per RFC #85 — the Datadog backend reconstructs the full
128-bit ID at ingest. The OTLP mapper previously read the tag per span,
so children landed with upper 64 bits zeroed and traces fragmented in
pure-OTel backends.

Resolve the chunk-level `_dd.p.tid` once in `map_traces_to_otlp` and
apply it to every span. Per-span value still wins (forward-compat with
tracers that propagate everywhere). Use `find_map` over the chunk so a
malformed root tag falls back to the first parseable value in the chunk
rather than poisoning the whole trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zacharycmontoya zacharycmontoya requested a review from a team as a code owner May 19, 2026 21:22
@zacharycmontoya zacharycmontoya requested review from mabdinur and removed request for a team May 19, 2026 21:22
@github-actions

📚 Documentation Check Results

⚠️ 568 documentation warning(s) found

📦 libdd-trace-utils - 568 warning(s)


Updated: 2026-05-19 21:24:33 UTC | Commit: 7931a7b | missing-docs job results

@github-actions

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

  • Base Branch: origin/main
  • PR Branch: origin/zach.montoya/otlp-propagate-dd-p-tid

Summary by Rule

Rule Base Branch PR Branch Change

Annotation Counts by File

File Base Branch PR Branch Change

Annotation Stats by Crate

| Crate | Base Branch | PR Branch | Change |
|---|---|---|---|
| clippy-annotation-reporter | 5 | 5 | No change (0%) |
| datadog-ffe-ffi | 1 | 1 | No change (0%) |
| datadog-ipc | 21 | 21 | No change (0%) |
| datadog-live-debugger | 6 | 6 | No change (0%) |
| datadog-live-debugger-ffi | 10 | 10 | No change (0%) |
| datadog-profiling-replayer | 4 | 4 | No change (0%) |
| datadog-remote-config | 3 | 3 | No change (0%) |
| datadog-sidecar | 57 | 57 | No change (0%) |
| libdd-common | 13 | 13 | No change (0%) |
| libdd-common-ffi | 12 | 12 | No change (0%) |
| libdd-data-pipeline | 5 | 5 | No change (0%) |
| libdd-ddsketch | 2 | 2 | No change (0%) |
| libdd-dogstatsd-client | 1 | 1 | No change (0%) |
| libdd-profiling | 13 | 13 | No change (0%) |
| libdd-telemetry | 20 | 20 | No change (0%) |
| libdd-tinybytes | 4 | 4 | No change (0%) |
| libdd-trace-normalization | 2 | 2 | No change (0%) |
| libdd-trace-obfuscation | 3 | 3 | No change (0%) |
| libdd-trace-stats | 1 | 1 | No change (0%) |
| libdd-trace-utils | 15 | 15 | No change (0%) |
| Total | 198 | 198 | No change (0%) |

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

@github-actions

🔒 Cargo Deny Results

⚠️ 4 issue(s) found, showing only errors (advisories, bans, sources)

📦 libdd-trace-utils - 4 error(s)

error[unsound]: Rand is unsound with a custom logger using `rand::rng()`
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:177:1
    │
177 │ rand 0.8.5 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ unsound advisory detected
    │
    ├ ID: RUSTSEC-2026-0097
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0097
    ├ It has been reported (by @lopopolo) that the `rand` library is [unsound](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library) (i.e. that safe code using the public API can cause Undefined Behaviour) when all the following conditions are met:
      
      - The `log` and `thread_rng` features are enabled
      - A [custom logger](https://docs.rs/log/latest/log/#implementing-a-logger) is defined
      - The custom logger accesses `rand::rng()` (previously `rand::thread_rng()`) and calls any `TryRng` (previously `RngCore`) methods on `ThreadRng`
      - The `ThreadRng` (attempts to) reseed while called from the custom logger (this happens every 64 kB of generated data)
      - Trace-level logging is enabled or warn-level logging is enabled and the random source (the `getrandom` crate) is unable to provide a new seed
      
      `TryRng` (previously `RngCore`) methods for `ThreadRng` use `unsafe` code to cast `*mut BlockRng<ReseedingCore>` to `&mut BlockRng<ReseedingCore>`. When all the above conditions are met this results in an aliased mutable reference, violating the Stacked Borrows rules. Miri is able to detect this violation in sample code. Since construction of [aliased mutable references is Undefined Behaviour](https://doc.rust-lang.org/stable/nomicon/references.html), the behaviour of optimized builds is hard to predict.
    ├ Announcement: https://github.com/rust-random/rand/pull/1763
    ├ Solution: Upgrade to >=0.10.1 OR <0.10.0, >=0.9.3 OR <0.9.0, >=0.8.6 (try `cargo update -p rand`)
    ├ rand v0.8.5
      ├── (dev) libdd-common v4.1.0
      │   ├── libdd-capabilities-impl v2.0.0
      │   │   └── libdd-trace-utils v4.0.0
      │   │       └── (dev) libdd-trace-utils v4.0.0 (*)
      │   └── libdd-trace-utils v4.0.0 (*)
      ├── (dev) libdd-trace-normalization v2.0.0
      │   └── libdd-trace-utils v4.0.0 (*)
      ├── libdd-trace-utils v4.0.0 (*)
      └── proptest v1.5.0
          └── (dev) libdd-tinybytes v1.1.1
              ├── (dev) libdd-tinybytes v1.1.1 (*)
              └── libdd-trace-utils v4.0.0 (*)

error[vulnerability]: Name constraints for URI names were incorrectly accepted
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:199:1
    │
199 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0098
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0098
    ├ Name constraints for URI names were ignored and therefore accepted.
      
      Note this library does not provide an API for asserting URI names, and URI name constraints are otherwise not implemented.  URI name constraints are now rejected unconditionally.
      
      Since name constraints are restrictions on otherwise properly-issued certificates, this bug is reachable only after signature verification and requires misissuance to exploit.
      
      This vulnerability is identified as [GHSA-965h-392x-2mh5](https://github.com/rustls/webpki/security/advisories/GHSA-965h-392x-2mh5). Thank you to @1seal for the report.
    ├ Solution: Upgrade to >=0.103.12, <0.104.0-alpha.1 OR >=0.104.0-alpha.6 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v4.1.0
          │       ├── libdd-capabilities-impl v2.0.0
          │       │   └── libdd-trace-utils v4.0.0
          │       │       └── (dev) libdd-trace-utils v4.0.0 (*)
          │       └── libdd-trace-utils v4.0.0 (*)
          ├── libdd-common v4.1.0 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v4.1.0 (*)

error[vulnerability]: Name constraints were accepted for certificates asserting a wildcard name
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:199:1
    │
199 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0099
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0099
    ├ Permitted subtree name constraints for DNS names were accepted for certificates asserting a wildcard name.
      
      This was incorrect because, given a name constraint of `accept.example.com`, `*.example.com` could feasibly allow a name of `reject.example.com` which is outside the constraint.
      This is very similar to [CVE-2025-61727](https://go.dev/issue/76442).
      
      Since name constraints are restrictions on otherwise properly-issued certificates, this bug is reachable only after signature verification and requires misissuance to exploit.
      
      This vulnerability is identified as [GHSA-xgp8-3hg3-c2mh](https://github.com/rustls/webpki/security/advisories/GHSA-xgp8-3hg3-c2mh). Thank you to @1seal for the report.
    ├ Solution: Upgrade to >=0.103.12, <0.104.0-alpha.1 OR >=0.104.0-alpha.6 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v4.1.0
          │       ├── libdd-capabilities-impl v2.0.0
          │       │   └── libdd-trace-utils v4.0.0
          │       │       └── (dev) libdd-trace-utils v4.0.0 (*)
          │       └── libdd-trace-utils v4.0.0 (*)
          ├── libdd-common v4.1.0 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v4.1.0 (*)

error[vulnerability]: Reachable panic in certificate revocation list parsing
    ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:199:1
    │
199 │ rustls-webpki 0.103.10 registry+https://github.com/rust-lang/crates.io-index
    │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ security vulnerability detected
    │
    ├ ID: RUSTSEC-2026-0104
    ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0104
    ├ A panic was reachable when parsing certificate revocation lists via [`BorrowedCertRevocationList::from_der`]
      or [`OwnedCertRevocationList::from_der`].  This was the result of mishandling a syntactically valid empty
      `BIT STRING` appearing in the `onlySomeReasons` element of a `IssuingDistributionPoint` CRL extension.
      
      This panic is reachable prior to a CRL's signature being verified.
      
      Applications that do not use CRLs are not affected.
      
      Thank you to @tynus3 for the report.
    ├ Solution: Upgrade to >=0.103.13, <0.104.0-alpha.1 OR >=0.104.0-alpha.7 (try `cargo update -p rustls-webpki`)
    ├ rustls-webpki v0.103.10
      └── rustls v0.23.37
          ├── hyper-rustls v0.27.7
          │   └── libdd-common v4.1.0
          │       ├── libdd-capabilities-impl v2.0.0
          │       │   └── libdd-trace-utils v4.0.0
          │       │       └── (dev) libdd-trace-utils v4.0.0 (*)
          │       └── libdd-trace-utils v4.0.0 (*)
          ├── libdd-common v4.1.0 (*)
          └── tokio-rustls v0.26.0
              ├── hyper-rustls v0.27.7 (*)
              └── libdd-common v4.1.0 (*)

advisories FAILED, bans ok, sources ok

Updated: 2026-05-19 21:26:26 UTC | Commit: 7931a7b | dependency-check job results

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 72.78%. Comparing base (cea1e44) to head (5e0dc6f).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2014      +/-   ##
==========================================
+ Coverage   72.65%   72.78%   +0.12%     
==========================================
  Files         453      453              
  Lines       74934    75131     +197     
==========================================
+ Hits        54441    54681     +240     
+ Misses      20493    20450      -43     
| Components | Coverage Δ |
|---|---|
| libdd-crashtracker | 65.31% <ø> (ø) |
| libdd-crashtracker-ffi | 37.56% <ø> (ø) |
| libdd-alloc | 98.77% <ø> (ø) |
| libdd-data-pipeline | 85.91% <ø> (ø) |
| libdd-data-pipeline-ffi | 73.93% <ø> (ø) |
| libdd-common | 79.81% <ø> (ø) |
| libdd-common-ffi | 74.41% <ø> (ø) |
| libdd-telemetry | 73.34% <ø> (ø) |
| libdd-telemetry-ffi | 31.36% <ø> (ø) |
| libdd-dogstatsd-client | 82.64% <ø> (ø) |
| datadog-ipc | 76.22% <ø> (+1.46%) ⬆️ |
| libdd-profiling | 81.69% <ø> (-0.02%) ⬇️ |
| libdd-profiling-ffi | 64.79% <ø> (ø) |
| libdd-sampling | 97.46% <ø> (ø) |
| datadog-sidecar | 28.87% <ø> (ø) |
| datdog-sidecar-ffi | 8.56% <ø> (ø) |
| spawn-worker | 48.86% <ø> (ø) |
| libdd-tinybytes | 93.16% <ø> (ø) |
| libdd-trace-normalization | 81.71% <ø> (ø) |
| libdd-trace-obfuscation | 87.30% <ø> (ø) |
| libdd-trace-protobuf | 68.25% <ø> (ø) |
| libdd-trace-utils | 89.87% <100.00%> (+0.28%) ⬆️ |
| libdd-tracer-flare | 86.88% <ø> (ø) |
| libdd-log | 74.83% <ø> (ø) |

@datadog-prod-us1-3

datadog-prod-us1-3 Bot commented May 19, 2026

Tests

🎉 All green!

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage (details)
Patch Coverage: 100.00%
Overall Coverage: 72.78% (+0.13%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 5e0dc6f | Docs | Datadog PR Page | Give us feedback!

@dd-octo-sts

dd-octo-sts Bot commented May 19, 2026

Artifact Size Benchmark Report

aarch64-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.so | 7.57 MB | 7.57 MB | 0% (0 B) 👌 |
| /aarch64-alpine-linux-musl/lib/libdatadog_profiling.a | 81.76 MB | 81.76 MB | +0% (+2.09 KB) 👌 |

aarch64-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so | 10.01 MB | 10.01 MB | 0% (0 B) 👌 |
| /aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a | 97.95 MB | 97.95 MB | +0% (+2.92 KB) 👌 |

libdatadog-x64-windows

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll | 24.47 MB | 24.47 MB | +0% (+512 B) 👌 |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib | 79.87 KB | 79.87 KB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb | 180.01 MB | 180.05 MB | +.01% (+32.00 KB) 🔍 |
| /libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib | 913.27 MB | 913.28 MB | +0% (+11.22 KB) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll | 7.73 MB | 7.73 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib | 79.87 KB | 79.87 KB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb | 23.15 MB | 23.15 MB | 0% (0 B) 👌 |
| /libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib | 45.34 MB | 45.34 MB | +0% (+2.23 KB) 👌 |

libdatadog-x86-windows

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll | 21.09 MB | 21.09 MB | +0% (+1.50 KB) 👌 |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib | 81.11 KB | 81.11 KB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb | 184.21 MB | 184.24 MB | +.01% (+32.00 KB) 🔍 |
| /libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib | 898.97 MB | 898.98 MB | +0% (+11.54 KB) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll | 5.99 MB | 5.99 MB | +0% (+512 B) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib | 81.11 KB | 81.11 KB | 0% (0 B) 👌 |
| /libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb | 24.80 MB | 24.80 MB | +.03% (+8.00 KB) 🔍 |
| /libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib | 42.85 MB | 42.85 MB | +0% (+2.32 KB) 👌 |

x86_64-alpine-linux-musl

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.a | 72.87 MB | 72.87 MB | +0% (+4.71 KB) 👌 |
| /x86_64-alpine-linux-musl/lib/libdatadog_profiling.so | 8.42 MB | 8.42 MB | 0% (0 B) 👌 |

x86_64-unknown-linux-gnu

| Artifact | Baseline | Commit | Change |
|---|---|---|---|
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a | 90.63 MB | 90.63 MB | +0% (+4.93 KB) 👌 |
| /x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so | 10.06 MB | 10.06 MB | 0% (0 B) 👌 |

Comment on lines +36 to +46
// Per RFC #85 the high 64 bits of a 128-bit trace ID live on the chunk root as the
// `_dd.p.tid` meta tag. Scan the chunk for the first parseable value so a malformed
// root tag doesn't poison the rest of the chunk; absence (legacy 64-bit traces) → 0.
let chunk_trace_id_high: u64 = chunk
    .iter()
    .find_map(|s| {
        s.meta
            .get("_dd.p.tid")
            .and_then(|v| u64::from_str_radix(v.borrow(), 16).ok())
    })
    .unwrap_or(0);

The Span type already represents trace IDs as 128-bit integers (the high 64 bits are discarded when serializing to v04). This is useful, for instance, in Python, where (once we release native spans) the trace ID field will hold the full trace ID.
For efficiency, I'd prefer that you first check whether those high bits are non-zero before scanning through meta.
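
One possible reading of this suggestion, sketched with a hypothetical `Span` whose `trace_id` is a `u128`. Checking every span's field, rather than only the first span's, is an assumption; so are all names here.

```rust
use std::collections::HashMap;

// Hypothetical span whose trace_id field can already hold all 128 bits.
struct Span {
    trace_id: u128,
    meta: HashMap<String, String>,
}

// Hypothetical constructor to keep the examples short.
fn span(trace_id: u128, tid: Option<&str>) -> Span {
    let mut meta = HashMap::new();
    if let Some(v) = tid {
        meta.insert("_dd.p.tid".to_string(), v.to_string());
    }
    Span { trace_id, meta }
}

fn chunk_trace_id_high(chunk: &[Span]) -> u64 {
    // Fast path: a span already carries non-zero high bits in its trace_id field.
    if let Some(high) = chunk
        .iter()
        .map(|s| (s.trace_id >> 64) as u64)
        .find(|h| *h != 0)
    {
        return high;
    }
    // Slow path: fall back to the first parseable `_dd.p.tid` meta tag.
    chunk
        .iter()
        .find_map(|s| {
            s.meta
                .get("_dd.p.tid")
                .and_then(|v| u64::from_str_radix(v, 16).ok())
        })
        .unwrap_or(0)
}
```

The fast path is a shift and compare per span, so the string lookup and hex parse only run when no span holds the high bits natively.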

@zacharycmontoya zacharycmontoya May 20, 2026

Ah okay. Are you suggesting I check only the first span's trace ID field to see if it's 128 bit or add a trace ID lookup in the iteration (for each span) before checking the meta dictionary?

Also, once I have this value, do you recommend passing the full 128-bit trace ID to map_span, or passing only the upper 64 bits as I do right now?

Comment on lines 119 to +123
  let trace_id_high: u128 = span
      .meta
      .get("_dd.p.tid")
      .and_then(|v| u64::from_str_radix(v.borrow(), 16).ok())
-     .unwrap_or(0) as u128;
+     .unwrap_or(chunk_trace_id_high) as u128;

Is this still necessary given that we do have the value passed as argument and all spans in a chunk should be of the same trace?

@zacharycmontoya zacharycmontoya May 20, 2026

Hmm. I think you're right, we don't need this lookup anymore. Removing this would only break scenarios where spans in the same chunk had different values for the _dd.p.tid span tag, which seems like an invalid scenario. I can remove this.
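
For reference, a sketch of what dropping the per-span lookup could look like. The function `map_chunk_trace_ids` is hypothetical and is not the PR's `map_span` signature.

```rust
/// Hypothetical sketch: every span in the chunk gets the same high 64 bits,
/// with no per-span `_dd.p.tid` lookup.
fn map_chunk_trace_ids(span_trace_ids_low: &[u64], chunk_trace_id_high: u64) -> Vec<u128> {
    let high = (chunk_trace_id_high as u128) << 64;
    span_trace_ids_low
        .iter()
        .map(|low| high | *low as u128)
        .collect()
}
```

The trade-off is the one described above: spans in the same chunk that carry differing `_dd.p.tid` values would all be emitted with the chunk-level value.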
