fix(instrumentations): include cache tokens in gen ai.usage.input tokens#1027
fix(instrumentations): include cache tokens in gen ai.usage.input tokens#1027dvirski wants to merge 6 commits into
Conversation
📝 WalkthroughWalkthroughAdds cache token fold-in semantics across Anthropic, Bedrock, and LangChain instrumentations (folding ChangesCache Token Fold-In Across Instrumentation Packages
Sequence Diagram(s)sequenceDiagram
participant AIResponse as AI Provider Response
participant Instrumentation as Instrumentation Layer
participant Span as OpenTelemetry Span
AIResponse->>Instrumentation: response with input_tokens, cache_read_input_tokens, cache_creation_input_tokens
Instrumentation->>Instrumentation: Compute totalInputTokens = input_tokens + cache_read_input_tokens + cache_creation_input_tokens
Instrumentation->>Span: Set gen_ai.usage.input_tokens = totalInputTokens
Instrumentation->>Span: Set gen_ai.usage.total_tokens = totalInputTokens + output_tokens
Instrumentation->>Span: Set gen_ai.usage.output_tokens = output_tokens
alt cache_read_input_tokens present
Instrumentation->>Span: Set gen_ai.usage.cache.read.input_tokens = cache_read_input_tokens
end
alt cache_creation_input_tokens present
Instrumentation->>Span: Set gen_ai.usage.cache.creation.input_tokens = cache_creation_input_tokens
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint install failed: dependency version conflict. Check your lock file or package.json. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/instrumentation-together/src/instrumentation.ts`:
- Around line 525-533: The cachedTokens variable is cast to number without type
validation, which could cause issues if the Together API returns a non-numeric
value. Add a type guard to verify that cachedTokens is actually a number before
calling span.setAttribute with ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS. Check
if typeof cachedTokens === 'number' in addition to the existing truthiness
check, and only set the attribute if the value passes both validations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1bc0adcd-18ed-4dc1-bbb4-6a1a91594454
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (15)
packages/instrumentation-anthropic/src/instrumentation.tspackages/instrumentation-anthropic/test/instrumentation.test.tspackages/instrumentation-bedrock/src/instrumentation.tspackages/instrumentation-bedrock/tests/anthropic.test.tspackages/instrumentation-bedrock/tests/cache-token-fold-in.test.tspackages/instrumentation-langchain/src/callback_handler.tspackages/instrumentation-langchain/test/cache-token-fold-in.test.tspackages/instrumentation-openai/src/instrumentation.tspackages/instrumentation-openai/test/instrumentation.test.tspackages/instrumentation-together/package.jsonpackages/instrumentation-together/src/instrumentation.tspackages/instrumentation-together/test/instrumentation.test.tspackages/instrumentation-vertexai/package.jsonpackages/instrumentation-vertexai/src/vertexai-instrumentation.tspackages/instrumentation-vertexai/tests/gemini.test.ts
| const cachedTokens = ( | ||
| result.usage as unknown as Record<string, unknown> | ||
| ).cached_tokens; | ||
| if (cachedTokens) { | ||
| span.setAttribute( | ||
| ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS, | ||
| cachedTokens as number, | ||
| ); | ||
| } |
There was a problem hiding this comment.
Validate type before casting to number.
The code casts cachedTokens to number without verifying it's actually a number. If the Together API returns a non-numeric value, this could cause downstream issues.
🛡️ Proposed fix to add type guard
const cachedTokens = (
result.usage as unknown as Record<string, unknown>
).cached_tokens;
-if (cachedTokens) {
+if (typeof cachedTokens === "number") {
span.setAttribute(
ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS,
- cachedTokens as number,
+ cachedTokens,
);
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/instrumentation-together/src/instrumentation.ts` around lines 525 -
533, The cachedTokens variable is cast to number without type validation, which
could cause issues if the Together API returns a non-numeric value. Add a type
guard to verify that cachedTokens is actually a number before calling
span.setAttribute with ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS. Check if
typeof cachedTokens === 'number' in addition to the existing truthiness check,
and only set the attribute if the value passes both validations.
max-deygin-traceloop
left a comment
There was a problem hiding this comment.
Looks alright, please look if cod rabbit's comment makes sense
And fix lint
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/instrumentation-langchain/src/callback_handler.ts (1)
356-373:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winGuard tokenUsage fallback against llmOutput.usage to prevent attribute overwrite.
The
!usageMetadataguard at line 356 correctly preventstokenUsagefrom running whenusage_metadataexists. However, it doesn't check whetherllmOutput.usage(lines 334-352) was already used. WhenusageMetadatais absent and bothllmOutput.usageandllmOutput.tokenUsageexist, both blocks will execute and lines 359, 362, and 367 will overwrite the attributes set by lines 337, 340, and 347-350.In practice,
llmOutput.usageandllmOutput.tokenUsagelikely represent mutually exclusive provider formats and may never coexist, but the code should be defensive.Add llmOutput.usage check to the tokenUsage guard
// Also check for tokenUsage format (for compatibility). // Skip when usage_metadata already populated the values. - if (!usageMetadata && output.llmOutput?.tokenUsage) { + if (!usageMetadata && !output.llmOutput?.usage && output.llmOutput?.tokenUsage) { const usage = output.llmOutput.tokenUsage;🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/instrumentation-langchain/src/callback_handler.ts` around lines 356 - 373, The guard condition for the tokenUsage fallback block checking `!usageMetadata && output.llmOutput?.tokenUsage` does not account for whether llmOutput.usage was already processed. If both llmOutput.usage and llmOutput.tokenUsage exist, the tokenUsage block will execute and overwrite the attributes (ATTR_GEN_AI_USAGE_INPUT_TOKENS, ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, and GEN_AI_USAGE_TOTAL_TOKENS) that were already set by the prior llmOutput.usage block. Add an additional guard check for `!output.llmOutput.usage` to the if condition so the tokenUsage fallback only runs when neither usageMetadata nor llmOutput.usage have been processed.
🧹 Nitpick comments (1)
packages/instrumentation-openai/src/instrumentation.ts (1)
972-978: ⚡ Quick winUse explicit type guard for consistency with surrounding code.
Lines 953-958 use
typeof inputTokens === "number"andtypeof outputTokens === "number"to guard attribute setting, ensuring attributes are set even when values are0. Line 973 usesif (cachedTokens)which is falsy for0, meaning the attribute won't be set when zero tokens are cached.While omitting the attribute when there are zero cached tokens may be semantically acceptable ("cache not used"), it's inconsistent with how the function handles other token counts and could cause downstream aggregators to treat "absent" differently from "zero."
Align with the existing type-guard pattern
- const cachedTokens = result.usage.input_tokens_details?.cached_tokens; - if (cachedTokens) { + const cachedTokens = result.usage.input_tokens_details?.cached_tokens; + if (typeof cachedTokens === "number") { span.setAttribute(🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/instrumentation-openai/src/instrumentation.ts` around lines 972 - 978, The code at line 973 uses a falsy check `if (cachedTokens)` to guard the ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS attribute setting, which is inconsistent with the pattern used earlier in the function for inputTokens and outputTokens (lines 953-958) that use explicit type guards like `typeof inputTokens === "number"`. Replace the `if (cachedTokens)` condition with `typeof cachedTokens === "number"` to match the existing type-guard pattern and ensure the attribute is properly set even when the cached tokens value is 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@packages/instrumentation-langchain/src/callback_handler.ts`:
- Around line 356-373: The guard condition for the tokenUsage fallback block
checking `!usageMetadata && output.llmOutput?.tokenUsage` does not account for
whether llmOutput.usage was already processed. If both llmOutput.usage and
llmOutput.tokenUsage exist, the tokenUsage block will execute and overwrite the
attributes (ATTR_GEN_AI_USAGE_INPUT_TOKENS, ATTR_GEN_AI_USAGE_OUTPUT_TOKENS, and
GEN_AI_USAGE_TOTAL_TOKENS) that were already set by the prior llmOutput.usage
block. Add an additional guard check for `!output.llmOutput.usage` to the if
condition so the tokenUsage fallback only runs when neither usageMetadata nor
llmOutput.usage have been processed.
---
Nitpick comments:
In `@packages/instrumentation-openai/src/instrumentation.ts`:
- Around line 972-978: The code at line 973 uses a falsy check `if
(cachedTokens)` to guard the ATTR_GEN_AI_USAGE_CACHE_READ_INPUT_TOKENS attribute
setting, which is inconsistent with the pattern used earlier in the function for
inputTokens and outputTokens (lines 953-958) that use explicit type guards like
`typeof inputTokens === "number"`. Replace the `if (cachedTokens)` condition
with `typeof cachedTokens === "number"` to match the existing type-guard pattern
and ensure the attribute is properly set even when the cached tokens value is 0.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 574594ea-d228-45dc-b03c-7e859e58e68a
📒 Files selected for processing (8)
packages/instrumentation-bedrock/tests/anthropic.test.tspackages/instrumentation-bedrock/tests/cache-token-fold-in.test.tspackages/instrumentation-langchain/src/callback_handler.tspackages/instrumentation-langchain/test/cache-token-fold-in.test.tspackages/instrumentation-openai/src/instrumentation.tspackages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-cache-read-input-tokens-in-span-for-responses-with-cached-tokens_3930499540/recording.harpackages/instrumentation-together/test/instrumentation.test.tspackages/instrumentation-vertexai/tests/gemini.test.ts
✅ Files skipped from review due to trivial changes (1)
- packages/instrumentation-openai/test/recordings/Test-OpenAI-instrumentation_1770406427/should-set-cache-read-input-tokens-in-span-for-responses-with-cached-tokens_3930499540/recording.har
🚧 Files skipped from review as they are similar to previous changes (5)
- packages/instrumentation-together/test/instrumentation.test.ts
- packages/instrumentation-bedrock/tests/anthropic.test.ts
- packages/instrumentation-bedrock/tests/cache-token-fold-in.test.ts
- packages/instrumentation-langchain/test/cache-token-fold-in.test.ts
- packages/instrumentation-vertexai/tests/gemini.test.ts
Per the OpenTelemetry GenAI semantic conventions, cache_read.input_tokens
and cache_creation.input_tokens should be a subset of input_tokens — i.e.,
input_tokens is the total billed input volume, with cache attributes
providing per-line-item detail. The JS instrumentations were diverging
from the spec and from the Python SDK, causing downstream aggregators to
under-count Anthropic input across SDK-JS traffic.
Changes:
cache_creation_input_tokens into gen_ai.usage.input_tokens and
gen_ai.usage.total_tokens. Streaming inherits via shared _endSpan path.
response-body handler.
(input_token_details.cache_read / cache_creation) from generation
messages and emit them as separate attributes. langchain-core's contract
documents input_tokens as already-summed, so no fold-in is needed there.
Legacy llmOutput.usage / tokenUsage paths preserved as fallback.
Cache attributes continue to be emitted separately for line-item visibility;
only the value written to input_tokens changes.
Summary by CodeRabbit