feat(data-pipeline): export client-computed span stats as OTLP trace metrics #2067
Draft
mabdinur wants to merge 9 commits into
…metrics

Add an OTLP HTTP/JSON trace-metrics export path so client-computed span stats can be shipped as the `dd.trace.span.duration` histogram to an OTLP `/v1/metrics` endpoint instead of the Datadog agent `/v0.6/stats` endpoint.

- libdd-ddsketch: reconstruct a DDSketch from its protobuf (`from_pb` / `from_encoded`) so histogram buckets can be rebuilt from the per-group summaries (Approach A: count + explicit bounds/bucket counts from the sketch bins, sum approximated from bin value*weight).
- libdd-data-pipeline: add `OtlpMetricsConfig`, `send_otlp_metrics_http`, the `map_stats_to_otlp_metrics` encoder, and an `OtlpStatsExporter` worker that periodically flushes the concentrator and exports metrics.
- TraceExporterBuilder: add `set_otlp_metrics_endpoint` / `set_otlp_metrics_headers`. When set together with `enable_stats`, the span concentrator is started unconditionally (bypassing the agent gate) and `check_agent_info` no longer toggles stats in this mode.

Co-authored-by: Cursor <cursoragent@cursor.com>
Codecov Report
@@ Coverage Diff @@
## main #2067 +/- ##
==========================================
+ Coverage 73.42% 73.46% +0.04%
==========================================
Files 465 466 +1
Lines 77949 78552 +603
==========================================
+ Hits 57231 57706 +475
- Misses 20718 20846 +128
…ndtrip test under Miri

Apply rustfmt formatting to the OTLP trace-metrics module and reorder a re-export to fix the failing Lint (rustfmt) CI job.

In `test_sketch_pb_roundtrip`, compare bin values with a relative tolerance instead of exact equality. Bin values are derived via `exp()` in `LogMapping::value`, and Miri intentionally perturbs the last ULPs of transcendental float ops, which made the exact `assert_eq!` fail under Miri. Bin counts are still compared exactly.

Co-authored-by: Cursor <cursoragent@cursor.com>
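The tolerance-based comparison described in this commit can be sketched as follows. This is only an illustration: the helper name `approx_eq_rel`, the `bins_match` wrapper, and the tolerance passed by callers are assumptions, not the code in `test_sketch_pb_roundtrip`.

```rust
/// Returns true when `a` and `b` differ by at most `rel_tol` times the
/// larger of their magnitudes. (Hypothetical helper, not from the PR.)
fn approx_eq_rel(a: f64, b: f64, rel_tol: f64) -> bool {
    if a == b {
        return true; // exact match, including both values being zero
    }
    let scale = a.abs().max(b.abs());
    (a - b).abs() <= rel_tol * scale
}

/// Compares two lists of (value, count) bins: values within a relative
/// tolerance, counts exactly, mirroring the policy in the commit message.
fn bins_match(original: &[(f64, f64)], roundtrip: &[(f64, f64)], rel_tol: f64) -> bool {
    original.len() == roundtrip.len()
        && original
            .iter()
            .zip(roundtrip.iter())
            .all(|(a, b)| approx_eq_rel(a.0, b.0, rel_tol) && a.1 == b.1)
}
```

A relative (rather than absolute) tolerance suits sketch bins because their values span many orders of magnitude; a fixed absolute epsilon would be too loose for small durations and too strict for large ones.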
Co-authored-by: Cursor <cursoragent@cursor.com>
# Conflicts:
# libdd-data-pipeline/src/otlp/exporter.rs

Co-authored-by: Cursor <cursoragent@cursor.com>
# Conflicts:
# libdd-data-pipeline/src/trace_exporter/builder.rs
# libdd-data-pipeline/src/trace_exporter/mod.rs

Apply rustfmt formatting to test code introduced during the origin/main merge so the Lint (rustfmt) CI job passes.

Co-authored-by: Cursor <cursoragent@cursor.com>
Rename the SDK-computed span metric to traces.span.sdk.metrics.duration (unit s), emit a single-source status.code for errors, min/max on each data point, and the dd.* attribute family (dd.operation.name, dd.span.type, dd.span.top_level, dd.origin). Add an enable_otel_trace_semantics() builder toggle that, when set, emits only OpenTelemetry attributes. Add host.name and dd.<key> process-tag resource attributes, plus a grpc_method stats field mapped to rpc.method. Co-authored-by: Cursor <cursoragent@cursor.com>
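One way to picture the attribute selection this commit describes is the sketch below. Everything in it is hypothetical: the `GroupKey` struct, the `data_point_attributes` function, the `otel_only` parameter (standing in for the `enable_otel_trace_semantics()` toggle), and the `"ERROR"` value for `status.code` are assumptions for illustration. Only the attribute names come from the commit message.

```rust
/// Hypothetical per-group key; field names are illustrative only.
struct GroupKey<'a> {
    operation_name: &'a str,
    span_type: &'a str,
    origin: &'a str,
    grpc_method: &'a str,
    top_level: bool,
    is_error: bool,
}

/// Builds the attribute list for one histogram data point.
/// With `otel_only` set, the `dd.*` attribute family is omitted.
fn data_point_attributes(key: &GroupKey, otel_only: bool) -> Vec<(&'static str, String)> {
    let mut attrs = Vec::new();
    if key.is_error {
        // Assumed value: the commit only says `status.code` is emitted for errors.
        attrs.push(("status.code", "ERROR".to_string()));
    }
    if !key.grpc_method.is_empty() {
        // The `grpc_method` stats field maps to `rpc.method`.
        attrs.push(("rpc.method", key.grpc_method.to_string()));
    }
    if !otel_only {
        attrs.push(("dd.operation.name", key.operation_name.to_string()));
        attrs.push(("dd.span.type", key.span_type.to_string()));
        attrs.push(("dd.span.top_level", key.top_level.to_string()));
        if !key.origin.is_empty() {
            attrs.push(("dd.origin", key.origin.to_string()));
        }
    }
    attrs
}
```

Keeping the toggle as a single branch around the `dd.*` family makes it easy to confirm that OpenTelemetry-only mode never leaks Datadog-specific keys.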
What does this PR do?
Adds native OTLP trace-metrics export to `libdd-data-pipeline`. When an OTLP metrics endpoint is configured on the `TraceExporter`, the span concentrator's stats are mapped into a `dd.trace.span.duration` OTLP histogram and POSTed to the configured `/v1/metrics` endpoint over HTTP/JSON.

- `otlp/metrics.rs`: OTLP metrics serde types, `map_stats_to_otlp_metrics` (DDSketch -> explicit bucket histogram, delta temporality), and the `OtlpStatsExporter` background worker.
- `otlp/config.rs`: `OtlpMetricsConfig`.
- `otlp/exporter.rs`: shared `send_otlp_http` helper + `send_otlp_metrics_http`.
- `TraceExporterBuilder`: `set_otlp_metrics_endpoint` / `set_otlp_metrics_headers`; when set, the span concentrator is started and the OTLP stats worker is spawned.
- `libdd-ddsketch`: `DDSketch::from_pb` / `from_encoded` to rebuild a sketch from its protobuf.

Motivation
Provide OTLP trace-metrics export as a reusable `libdatadog` capability so tracers only supply configuration, building on the existing OTLP export path.
Additional Notes
For reviewers:
- `count`, `explicit_bounds`, and `bucket_counts` are derived from the per-group ok/error DDSketch summaries; `sum` is approximated from the sketch bins. No new per-cell accumulators. Follow-ups documented in `otlp/metrics.rs`: exact per-cell `sum`, `top_level_hits`, gRPC/protobuf transport.
- `check_agent_info` no longer (de)activates stats based on agent info (`otlp_stats_enabled` flag), so export works without agent stats support.
- … `/v1/metrics`); invalid headers are skipped with a warning.
- … `run()` export works as expected. The force-flush in the worker's `shutdown()` issues its HTTP request from within the exporter's bounded `block_on(timeout, ...)` shutdown path, where the spawned hyper connection task is not reliably driven, so final-bucket data may be dropped on shutdown. Draining the last bucket on shutdown is a follow-up.
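The first reviewer note (deriving `count`, `explicit_bounds`, `bucket_counts`, and an approximate `sum` from sketch bins) can be illustrated with the sketch below. It is not the PR's `map_stats_to_otlp_metrics`: the `ExplicitHistogram` struct and `bins_to_histogram` function are hypothetical, bins are modeled as plain `(value, weight)` pairs, and treating each bin's representative value as a bucket upper bound is a simplifying assumption.

```rust
/// Hypothetical stand-in for an OTLP explicit-bucket histogram data point.
#[derive(Debug, PartialEq)]
struct ExplicitHistogram {
    count: u64,
    sum: f64,
    explicit_bounds: Vec<f64>,
    bucket_counts: Vec<u64>,
}

/// Converts (value, weight) sketch bins into an explicit-bucket histogram.
/// Assumption: each non-empty bin becomes one bucket whose upper bound is
/// the bin's representative value; a trailing overflow bucket stays empty.
fn bins_to_histogram(bins: &[(f64, f64)]) -> ExplicitHistogram {
    let mut sorted: Vec<(f64, f64)> = bins
        .iter()
        .copied()
        .filter(|(_, w)| *w > 0.0)
        .collect();
    sorted.sort_by(|a, b| a.0.total_cmp(&b.0));

    let mut explicit_bounds = Vec::with_capacity(sorted.len());
    let mut bucket_counts = Vec::with_capacity(sorted.len() + 1);
    let mut count = 0u64;
    let mut sum = 0.0;
    for (value, weight) in &sorted {
        let n = weight.round() as u64;
        explicit_bounds.push(*value);
        bucket_counts.push(n);
        count += n;
        // Approximate sum: representative value times the bin's weight.
        sum += value * weight;
    }
    // OTLP requires bucket_counts.len() == explicit_bounds.len() + 1.
    bucket_counts.push(0);

    ExplicitHistogram { count, sum, explicit_bounds, bucket_counts }
}
```

Because `sum` is reconstructed from representative values, it inherits the sketch's relative error, which is why an exact per-cell `sum` is listed as a follow-up.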
Draft: consumed by the dd-trace-py integration/testing PR (DataDog/dd-trace-py#18354).
How to test the change?
- `cargo test -p libdd-data-pipeline --lib otlp::metrics` and the `libdd-ddsketch` protobuf roundtrip test.
- `cargo clippy -p libdd-data-pipeline -p libdd-ddsketch --all-features` is clean.
- … `dd.trace.span.duration` OTLP histogram to a mock `/v1/metrics` endpoint.
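As a rough picture of what a mock `/v1/metrics` endpoint might receive, the sketch below renders a single delta histogram in the OTLP JSON encoding (lowerCamelCase field names, 64-bit integers as strings, enums as integers with delta temporality as `1`). The function and helper names are hypothetical, resource and scope fields are omitted, and the strings are built by hand only to stay dependency-free. The PR itself uses serde types in `otlp/metrics.rs`, whose exact shape may differ.

```rust
/// Joins floats as bare JSON numbers (assumes finite values).
fn join_numbers(values: &[f64]) -> String {
    values.iter().map(|v| v.to_string()).collect::<Vec<_>>().join(",")
}

/// Joins 64-bit integers as quoted strings, per the protobuf JSON mapping.
fn join_quoted(values: &[u64]) -> String {
    values.iter().map(|v| format!("\"{}\"", v)).collect::<Vec<_>>().join(",")
}

/// Renders one histogram metric with a single delta data point.
/// Assumes `name` needs no JSON escaping. (Hypothetical helper.)
fn histogram_metric_json(
    name: &str,
    start_ns: u64,
    end_ns: u64,
    count: u64,
    sum: f64,
    bounds: &[f64],
    bucket_counts: &[u64],
) -> String {
    let bounds = join_numbers(bounds);
    let buckets = join_quoted(bucket_counts);
    let point = format!(
        r#"{{"startTimeUnixNano":"{start_ns}","timeUnixNano":"{end_ns}","count":"{count}","sum":{sum},"explicitBounds":[{bounds}],"bucketCounts":[{buckets}]}}"#
    );
    // aggregationTemporality 1 = AGGREGATION_TEMPORALITY_DELTA.
    let metric = format!(
        r#"{{"name":"{name}","histogram":{{"aggregationTemporality":1,"dataPoints":[{point}]}}}}"#
    );
    format!(r#"{{"resourceMetrics":[{{"scopeMetrics":[{{"metrics":[{metric}]}}]}}]}}"#)
}
```

A mock endpoint in a test could assert on substrings such as the metric name and `"aggregationTemporality":1` rather than on the full body, which keeps the test robust to attribute ordering.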