feat(data-pipeline): OTLP HTTP/protobuf trace export #2115
Open
bm1549 wants to merge 17 commits into
Records the approved design: vendor OTLP trace + collector protos and generate prost types (zero new runtime deps), keep the hand-rolled serde JSON path, share one mapper with a serde->prost converter, and select the protocol via builder + C FFI. Includes the dd-trace-py companion wiring and the layered E2E plan (local receiver, system-tests, sdk-backend-verify). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bite-sized, TDD-structured plan across 9 phases: vendor+generate prost types, serde->prost converter, encoder dispatch, protocol config through builder + C FFI, full validation gauntlet + libdatadog PR, then dd-trace-py PyO3/writer wiring and three E2E tiers (local receiver, system-tests, sdk-backend-verify) + dd-trace-py PR. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Vendors opentelemetry/proto/trace/v1/trace.proto and opentelemetry/proto/collector/trace/v1/trace_service.proto from open-telemetry/opentelemetry-proto commit 1e725b853bc8f6b46ee62e8232e4c83017b9536f (matching the already-vendored common.proto and resource.proto). Adds both protos to the prost_build compile list in build.rs, generates the committed Rust types (opentelemetry.proto.trace.v1.rs and opentelemetry.proto.collector.trace.v1.rs), and updates _includes.rs. Also qualifies "Span" → "pb.Span" / "pb.idx.Span" in build.rs type_attribute calls to prevent serde derives from leaking into the new opentelemetry::proto::trace::v1::Span type.
…ests Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…-type match Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…o avoid doctest failures
… Grpc content-type arm
🎉 All green! 🧪 All tests passed. 🔗 Commit SHA: a8a305f
ee71538 to
664f16f
The OTLP design spec and implementation plan are linked from the PR description (internal chonk) rather than committed to the repo. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Selecting OtlpProtocol::Grpc on the Rust builder previously built a working-looking exporter that then failed on every send with a mis-typed Internal(InvalidWorkerState) error. Reject it in build_async (covering sync build + wasm) with BuilderErrorKind::InvalidConfiguration (FFI: InvalidArgument), matching the C FFI set_otlp_protocol setter so both entry points fail fast and identically. The send-time arm stays as a defensive guard. Also replace the "skip as instructed" placeholder comment in the OTLP serde->prost converter with a real test exercising link()/event() conversion (link trace/span ID byte assembly, tracestate, and link/event attributes). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
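The fail-fast behavior described in this commit can be sketched roughly as follows. This is a minimal illustration, not the libdatadog code: `ExporterBuilder`, `Exporter`, `BuildError`, and the `HttpJson`/`HttpProtobuf` variant names are hypothetical stand-ins, and only `OtlpProtocol::Grpc` and the idea of an invalid-configuration error come from the commit message.

```rust
/// Hypothetical mirror of the protocol enum; only `Grpc` is named in the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum OtlpProtocol {
    #[default]
    HttpJson,
    HttpProtobuf,
    Grpc,
}

/// Hypothetical stand-in for BuilderErrorKind::InvalidConfiguration.
#[derive(Debug, PartialEq, Eq)]
enum BuildError {
    InvalidConfiguration(String),
}

/// Hypothetical builder holding only the protocol choice.
#[derive(Debug, Default)]
struct ExporterBuilder {
    protocol: OtlpProtocol,
}

/// Hypothetical exporter; a real one would also hold endpoints, clients, etc.
#[derive(Debug, PartialEq, Eq)]
struct Exporter {
    protocol: OtlpProtocol,
}

impl ExporterBuilder {
    fn set_otlp_protocol(mut self, protocol: OtlpProtocol) -> Self {
        self.protocol = protocol;
        self
    }

    /// Rejects unsupported protocols at build time, so the caller never
    /// receives an exporter that would fail on every send.
    fn build(self) -> Result<Exporter, BuildError> {
        match self.protocol {
            OtlpProtocol::Grpc => Err(BuildError::InvalidConfiguration(
                "OTLP gRPC export is not supported".to_string(),
            )),
            protocol => Ok(Exporter { protocol }),
        }
    }
}
```

Validating in `build` keeps the error close to the misconfiguration; a send-time check can still remain as a defensive guard, as the commit notes.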
What does this PR do?

Adds OTLP HTTP/protobuf as a second trace-export encoding alongside the existing HTTP/JSON path. The export protocol is selectable via the OTel-standard values http/json (default) and http/protobuf, through the Rust builder method TraceExporterBuilder::set_otlp_protocol and a new C FFI setter ddog_trace_exporter_config_set_otlp_protocol.

Motivation
libdatadog's OTLP trace export only spoke HTTP/JSON. SDKs that honor OTEL_EXPORTER_OTLP_TRACES_PROTOCOL need http/protobuf to match the OTel default and to talk to collectors that expect protobuf.

Additional Notes
- Vendors the trace and collector/trace protos (from the same opentelemetry-proto commit as the already-vendored common/resource protos) and generates prost types in libdd-trace-protobuf. No new runtime dependency is added.
- A From<&serde> for prost converter produces the binary payload, and a parity test asserts both encodings carry the same span.flags. Carrying span-link trace-flags through both encoders is a separate follow-up (it adds a field to the public OtlpSpanLink, a breaking change).
- grpc parses but is rejected at export time (not supported yet).

How to test the change?
- cargo test -p libdd-trace-utils -p libdd-data-pipeline -p libdd-data-pipeline-ffi covers the unit and integration tests, including a protobuf round-trip test that decodes the emitted body with ExportTraceServiceRequest::decode and checks the application/x-protobuf content-type.
- cargo +stable clippy --workspace --all-targets --all-features -- -D warnings, cargo +nightly-2026-02-08 fmt --all -- --check, cargo test --doc, and cargo ffi-test all pass.

End-to-end validation through dd-trace-py
Validated the full chain — dd-trace-py builds the OTLP protobuf body via this change, ships it to a local Datadog Agent's OTLP intake, and the Agent forwards to the Datadog backend.
dd-trace-py consumer PR: DataDog/dd-trace-py#18609 · system-tests coverage: DataDog/system-tests#7131
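As background for the wire-format checks, the mapping from protocol value to Content-Type can be sketched as below. The string values and content types are the ones quoted in this description; the `WireProtocol` type, its variant names, and the `content_type` helper are hypothetical and are not the libdatadog API. Exact-match parsing (no trimming or case folding) is also an assumption.

```rust
use std::str::FromStr;

/// Hypothetical enum covering the OTel protocol values mentioned in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WireProtocol {
    HttpJson,
    HttpProtobuf,
    Grpc,
}

impl FromStr for WireProtocol {
    type Err = String;

    /// Parses the OTel-standard protocol strings; anything else is an error.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "http/json" => Ok(WireProtocol::HttpJson),
            "http/protobuf" => Ok(WireProtocol::HttpProtobuf),
            "grpc" => Ok(WireProtocol::Grpc),
            other => Err(format!("unknown OTLP protocol: {other}")),
        }
    }
}

/// Returns the HTTP Content-Type for a protocol, or None when the protocol
/// has no HTTP encoding in this sketch (gRPC parses but is not exported).
fn content_type(protocol: WireProtocol) -> Option<&'static str> {
    match protocol {
        WireProtocol::HttpJson => Some("application/json"),
        WireProtocol::HttpProtobuf => Some("application/x-protobuf"),
        WireProtocol::Grpc => None,
    }
}
```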
Wire format (tcpdump on the OTLP intake):
- http/protobuf → Content-Type: application/x-protobuf, zero JSON bodies, all requests answered HTTP/1.1 200.
- http/json regression → Content-Type: application/json, HTTP/1.1 200. Protocol selection works both directions.

Backend ingestion (fresh protobuf-only service so the spans can only have arrived via the protobuf encoder), counts by resource_name: GET /protobuf-a, GET /protobuf-b, SELECT 1.

Span fidelity: a sampled span carried the correct service/resource, the OTLP→Datadog mapping (custom.otel.*, deployment.environment.name), and a preserved 128-bit trace id (6a2c7287000000006ce19688cf3536b6, 32 hex chars, upper 64 bits carrying _dd.p.tid).
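To make the 128-bit trace id check concrete, here is a small sketch that splits the 32-hex-char id quoted above into its upper and lower 64-bit halves. The helper names `split_trace_id` and `tid_tag` are hypothetical, and rendering the upper half as 16 lowercase hex characters for _dd.p.tid is an assumption about the tag format rather than something stated in this PR.

```rust
/// Splits a 32-hex-char trace id into (upper 64 bits, lower 64 bits).
/// Returns None for input that is not exactly 32 ASCII hex digits.
fn split_trace_id(hex: &str) -> Option<(u64, u64)> {
    if hex.len() != 32 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let full = u128::from_str_radix(hex, 16).ok()?;
    // The shift keeps the upper half; the cast truncates to the lower half.
    Some(((full >> 64) as u64, full as u64))
}

/// Formats the upper 64 bits as a zero-padded 16-char lowercase hex string
/// (assumed here to be the shape of the _dd.p.tid value).
fn tid_tag(high: u64) -> String {
    format!("{high:016x}")
}
```

For the id in this description, the upper half is 6a2c728700000000 and the lower half is 6ce19688cf3536b6.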