
fix: use block_number/block_offset to uniquely identify log rows#2071

Open
karl-power wants to merge 1 commit into main from karl/id-field-in-log-filter

Conversation


@karl-power karl-power commented Apr 8, 2026

Summary

Fixes an issue where clicking a log row to expand it could show the wrong row's details when multiple rows share identical visible column values (Timestamp, ServiceName, Body, etc.).

Root cause: The detail panel query reconstructs a WHERE clause from only the visible columns, without the original search filters. When rows collide on those columns, LIMIT 1 returns whichever row ClickHouse finds first — not necessarily the one clicked.

Fix: Detect and inject ClickHouse's built-in _block_number and _block_offset virtual columns as hidden tiebreakers in the row SELECT. These are available when the table engine was created with enable_block_number_column = 1 and enable_block_offset_column = 1. Since the row WHERE clause builder already iterates all columns in the row data, the block columns are automatically included in detail queries — disambiguating rows without surfacing them in the UI.

A fallback to __hdx_id (a custom materialized column from the prior approach) is retained for any tables that already have that column.

No schema migration required for the primary fix — the approach relies entirely on engine-level settings that already exist on the otel_logs table.
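To make the mechanism concrete, the detection-and-fallback logic above might look roughly like this (a sketch with hypothetical helper names; the real code lives in DBRowTable.tsx and may differ):

```typescript
// Sketch of the tiebreaker selection described above. `engineFull` is the
// `engine_full` string from ClickHouse's system.tables; `columnNames` are the
// table's column names.
const BLOCK_COLUMNS = ['_block_number', '_block_offset'];

function getTiebreakerKeys(engineFull: string, columnNames: string[]): string[] {
  // The virtual block columns are only queryable when both engine settings are on.
  const hasBlockColumns =
    engineFull.includes('enable_block_number_column = 1') &&
    engineFull.includes('enable_block_offset_column = 1');
  if (hasBlockColumns) return BLOCK_COLUMNS;
  // Fallback for tables that kept the materialized column from the old approach.
  if (columnNames.includes('__hdx_id')) return ['__hdx_id'];
  return [];
}
```

Because the row WHERE clause builder iterates every column in the row data, whatever keys this returns end up in the detail query's WHERE clause automatically.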

Key changes

  • DBRowTable.tsx: Rename appendSelectWithPrimaryAndPartitionKey to appendSelectWithAdditionalKeys, adding an extraKeys: string[] param. Rename useConfigWithPrimaryAndPartitionKey to useConfigWithAdditionalSelect; when a sourceId is present (row-level query), inspect engine_full metadata for the block column flags and inject _block_number/_block_offset as extra keys, falling back to __hdx_id if present.
  • types.ts / source.ts / SourceForm.tsx: Remove uniqueRowIdExpression — the tiebreaker is now auto-detected from the table engine, not manually configured.
  • Schema seed: Remove the __hdx_id materialized column addition from otel_logs; the block column engine flags are sufficient.
  • Tests: New appendSelectWithAdditionalKeys cases covering extraKeys append, deduplication against primary/partition keys, and array-style selects. Scaffold for inferTableSourceConfig tests.
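A minimal sketch of the append-and-dedup behavior those tests describe (the real appendSelectWithAdditionalKeys signature in DBRowTable.tsx may differ):

```typescript
// Sketch: append extra keys to a select expression, skipping any key that is
// already selected (e.g. a primary or partition key). Accepts either the
// comma-separated string form or the array form of a select.
function appendSelectWithAdditionalKeys(
  select: string | string[],
  extraKeys: string[],
): string {
  const existing = (Array.isArray(select) ? select : select.split(','))
    .map(s => s.trim())
    .filter(Boolean);
  // Deduplicate so a key never appears twice in the SELECT list.
  const toAppend = extraKeys.filter(k => !existing.includes(k));
  return [...existing, ...toAppend].join(', ');
}
```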

How to test locally

Reproduce the bug on main

  1. Pick a log row from otel_logs whose ResourceAttributes.host.arch is arm64. In the ClickHouse UI, run the query that fires when you click a row to open its sidebar.
  2. Insert the same row into otel_logs but change one nested value — e.g. change ResourceAttributes.host.arch from arm64 to arm65.
  3. On the search page, filter for ResourceAttributes.host.arch = arm65. One result appears. Click it to open the sidebar.
  4. The sidebar shows arm64 — it selected the wrong row.

Verify the fix on this branch

  1. Confirm otel_logs was created with enable_block_number_column = 1 and enable_block_offset_column = 1 (check via SHOW CREATE TABLE default.otel_logs).
  2. Repeat steps 1–3 above.
  3. The sidebar now shows arm65 — correct row selected.

Verify fallback path

  1. On a table that has __hdx_id but no block columns, the app still injects __hdx_id as the tiebreaker (same behavior as the previous approach).

References


changeset-bot Bot commented Apr 8, 2026

🦋 Changeset detected

Latest commit: 0f75ae4

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@hyperdx/api Patch
@hyperdx/app Patch
@hyperdx/otel-collector Patch



vercel Bot commented Apr 8, 2026

The latest updates on your projects.

Project      Status  Updated (UTC)
hyperdx-oss  Ready   May 1, 2026 1:44am



github-actions Bot commented Apr 8, 2026

E2E Test Results

All tests passed • 159 passed • 3 skipped • 1189s

Status Count
✅ Passed 159
❌ Failed 0
⚠️ Flaky 4
⏭️ Skipped 3

Tests ran across 4 shards in parallel.


@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from a7ee44f to 3151b4d Compare April 9, 2026 09:00
@karl-power karl-power marked this pull request as ready for review April 9, 2026 13:53
@github-actions github-actions Bot added the review/tier-4 Critical — deep review + domain expert sign-off label Apr 9, 2026

github-actions Bot commented Apr 9, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (1):
    • docker/otel-collector/schema/seed/00002_otel_logs.sql
  • Cross-layer change: touches frontend (packages/app) + backend (packages/api) + shared utils (packages/common-utils)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 6
  • Production lines changed: 81 (+ 73 in test files, excluded from tier calculation)
  • Branch: karl/id-field-in-log-filter
  • Author: karl-power

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@karl-power karl-power requested a review from wrn14897 April 9, 2026 13:54

github-actions Bot commented Apr 9, 2026

PR Review

PR #2071 — fix: use block_number/block_offset to uniquely identify log rows

The approach is sound: injecting ClickHouse's virtual block columns as hidden tiebreakers cleanly solves the duplicate-row collision bug without any schema migration for new installs.

  • ⚠️ Fragile engine string matching → engineFull.includes('enable_block_number_column = 1') will silently fall back to __hdx_id (or no tiebreaker) if ClickHouse changes whitespace/formatting in engine_full. Consider a case-insensitive regex or trimming spaces around = to be safe.

  • ⚠️ No migration path for existing deployments → Tables created before this change won't have enable_block_number_column/enable_block_offset_column set; the fix is a no-op for them unless they recreate otel_logs. The fallback to __hdx_id helps only if that materialized column exists. This should be called out in release notes / upgrade docs.

  • ⚠️ sourceId used as a boolean flag → The sourceId?: string parameter in useConfigWithAdditionalSelect is only used to enable/disable the useColumns query. Callers like usePatterns.tsx pass no sourceId (correctly), but the parameter name obscures its true purpose ("am I a row-level query?"). A dedicated isRowQuery?: boolean param would be less surprising to future callers.

  • ✅ uniqueRowIdExpression is fully removed with no dangling references — clean.

  • ✅ New tests cover extraKeys append, dedup against select, dedup against primary keys, and array-style select.

  • ✅ additionalKeysLength count remains correct with extraKeys (hidden keys are properly stripped from the UI via selectColumnMapWithoutAdditionalKeys).
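One possible hardening of the engine_full string check flagged above, sketched with a hypothetical helper name:

```typescript
// Tolerates whitespace and case differences around the setting assignment,
// e.g. "enable_block_number_column=1" or "ENABLE_BLOCK_NUMBER_COLUMN  =  1".
function hasBlockSetting(engineFull: string, setting: string): boolean {
  return new RegExp(`${setting}\\s*=\\s*1\\b`, 'i').test(engineFull);
}
```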

@karl-power
Contributor Author

@wrn14897 Do you know if the migrations are needed here? I tried to run them locally which invokes migrate, but I don't see that installed anywhere.


@wrn14897 wrn14897 left a comment


I think we should use the UUID type for the id column (https://clickhouse.com/docs/sql-reference/data-types/uuid). Using toUInt16(rand()) only provides 65,536 possible values, which could lead to collisions.

I’d also suggest raising this with the team, since it involves a schema change. It would be good to make sure we all align on the approach for the ID field before moving forward. cc @knudtty
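As a quick sanity check on the collision concern, the birthday approximation says a 50% chance of a duplicate among 2^16 ids is reached after only a few hundred rows (an illustrative calculation, not part of the PR):

```typescript
// Birthday approximation: ~50% chance of a duplicate after sqrt(2 * N * ln 2)
// draws from an id space of size N. For toUInt16(rand()), N = 2^16.
const idSpace = 2 ** 16; // 65,536 possible UInt16 values
const rowsFor50PctCollision = Math.sqrt(2 * idSpace * Math.log(2));
console.log(Math.round(rowsFor50PctCollision)); // ≈ 301 rows
```

So without the other search fields narrowing things down, duplicates become likely almost immediately at typical log-ingest volumes.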


karl-power commented Apr 13, 2026

> I think we should use the UUID type for the id column

@MikeShi42 do you have thoughts on this as you originally suggested UInt16?

I'm guessing the reasoning was that UInt16 combined with the other fields from the search query would be enough to reduce the chance of a collision, with the benefit of less storage than UUID

@wrn14897
Member

> I think we should use the UUID type for the id column
>
> @MikeShi42 do you have thoughts on this as you originally suggested UInt16?
>
> I'm guessing the reasoning was that UInt16 combined with the other fields from the search query would be enough to reduce the chance of a collision, with the benefit of less storage than UUID

I understand the reason for reducing storage (to achieve a better compression ratio), and that this ID is only used internally. Also, as you mentioned, the chance of collisions would be lower if the search is always combined with other fields. I just want to confirm that this is intentional.

@karl-power
Contributor Author

@wrn14897 I have shared this with the team but haven't received any feedback yet. Are you happy to proceed like this, or would you prefer to use UUID?

@MikeShi42
Contributor

@karl-power instead of creating a new column, any thoughts on whether we should instead just use _block_number and _block_offset and enable enable_block_number_column and enable_block_offset_column by default in new tables? This would technically allow other features as well within ClickHouse, so it's more useful than a random id.

We can tell existing users to run an update to add this new random id column (as suggested in this PR) if they haven't set up their table to use block number/offsets.

@karl-power
Contributor Author

karl-power commented Apr 28, 2026

> @karl-power instead of creating a new column, any thoughts on if we should instead just use _block_number and _block_offset and enabling enable_block_number_column and enable_block_offset_column by default in new tables? This will technically allow other features as well within ClickHouse - so more useful than a random id.
>
> We can tell existing users to run an update to add this new random id column (as suggested in this PR) if they haven't set up their table to use block number/offsets.

@MikeShi42 Yeah, I think the block_number/offset combo would work. So you're suggesting we remove the UI control for the unique identifier and fall back to __hdx_id if the block settings aren't enabled?

@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from d790f62 to 327e41e Compare May 1, 2026 00:50
@karl-power karl-power changed the title fix: filter logs with ID field fix: use block_number/block_offset to uniquely identify log rows May 1, 2026
@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from 327e41e to e44d45a Compare May 1, 2026 01:36
@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from e44d45a to 0f75ae4 Compare May 1, 2026 01:41
@karl-power
Contributor Author

@MikeShi42 updated to use block number/offset


Labels

review/tier-4 Critical — deep review + domain expert sign-off

Projects

None yet

Development

Successfully merging this pull request may close these issues.

‘expand log line’ not use all filter.

3 participants