
Support for org-scoped ClickHouse#3333

Draft
matt-aitken wants to merge 23 commits into main from org-scoped-clickhouse

Conversation

@matt-aitken
Member

Added OrganizationDataStore, which allows organizations to have their data stored in separate, dedicated services.

For now this is only used for ClickHouse. When using ClickHouse, we get a client from the factory and pass in the org id.

Particular care has to be taken with two hot-insert paths:

  1. RunReplicationService
  2. OTLPExporter
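The per-organization lookup described above can be sketched as a small factory that caches one client per (orgId, clientType) pair; the names below (OrgClickhouseFactory, getClickhouseForOrganization, ClickhouseClient) are illustrative stand-ins, not the PR's actual API.

```typescript
// Minimal sketch of an org-scoped client factory (hypothetical names and shapes).
type ClientType = "query" | "logs" | "replication";

interface ClickhouseClient {
  url: string;
  name: string;
}

class OrgClickhouseFactory {
  // Cache one client per (orgId, clientType) so hot-insert paths reuse connections.
  private cache = new Map<string, ClickhouseClient>();

  constructor(private resolveUrl: (orgId: string) => string | undefined) {}

  getClickhouseForOrganization(orgId: string, clientType: ClientType): ClickhouseClient {
    const key = `${orgId}:${clientType}`;
    let client = this.cache.get(key);
    if (!client) {
      // Fall back to a shared default when the org has no dedicated store.
      const url = this.resolveUrl(orgId) ?? "clickhouse://default:8123";
      client = { url, name: `clickhouse-${clientType}-${orgId}` };
      this.cache.set(key, client);
    }
    return client;
  }
}
```

A hot path would call the factory once per insert batch; since lookups hit the in-memory cache after the first call, the per-org indirection stays cheap.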

@changeset-bot

changeset-bot bot commented Apr 7, 2026

⚠️ No Changeset found

Latest commit: de2996d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types.


@coderabbitai
Contributor

coderabbitai bot commented Apr 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Replaces module-level ClickHouse singletons with a ClickhouseFactory that returns per-organization ClickHouse clients and event repositories. Adds OrganizationDataStore model and migration, ClickHouse secret schemas, OrganizationDataStoresRegistry with background reload/retry, admin UI/routes to manage data stores, and new clickhouse factory/instance modules and test helpers. Refactors event repository interfaces/implementations (metrics batching and insert APIs), updates many presenters, routes, services, and tests to use organization-scoped clients, and introduces related tests, env var, docs, and minor misc changes.
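The OrganizationDataStoresRegistry with background reload mentioned in the walkthrough might look roughly like the following; the shape (a lookup map keyed by org and kind, refreshed on an interval, swapped atomically) is inferred from this summary and the review comments, and all names are illustrative.

```typescript
// Illustrative registry mapping (orgId, kind) to a data-store config,
// reloaded from a backing source on an interval (names are hypothetical).
type DataStoreKind = "clickhouse";

interface DataStoreConfig {
  kind: DataStoreKind;
  url: string;
}

class OrganizationDataStoresRegistry {
  private lookup = new Map<string, DataStoreConfig>();
  private timer?: ReturnType<typeof setInterval>;

  constructor(
    private load: () => Promise<Array<{ orgId: string; config: DataStoreConfig }>>
  ) {}

  async start(reloadIntervalMs: number): Promise<void> {
    await this.reload();
    // Periodic background refresh; a failed reload keeps the last good lookup.
    this.timer = setInterval(() => void this.reload().catch(() => {}), reloadIntervalMs);
  }

  stop(): void {
    if (this.timer) clearInterval(this.timer);
  }

  private async reload(): Promise<void> {
    const rows = await this.load();
    const next = new Map<string, DataStoreConfig>();
    for (const row of rows) {
      next.set(`${row.orgId}:${row.config.kind}`, row.config);
    }
    this.lookup = next; // swap atomically so readers never see a partial map
  }

  get(orgId: string, kind: DataStoreKind): DataStoreConfig | undefined {
    return this.lookup.get(`${orgId}:${kind}`);
  }
}
```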

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

  • Description check (⚠️ Warning): The description explains the core feature (OrganizationDataStore for org-specific services, initially ClickHouse) and notes critical areas requiring attention (RunReplicationService, OTLPExporter), but does not follow the provided template structure. Resolution: add the template sections: Closes #, Checklist with contributing guide confirmation, Testing steps, Changelog, and Screenshots as applicable.
  • Docstring Coverage (⚠️ Warning): Docstring coverage is 23.08%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (1 passed)

  • Title check (✅ Passed): The title 'Support for org-scoped ClickHouse' clearly describes the main change: adding organization-specific ClickHouse client support.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs/route.tsx (1)

137-159: ⚠️ Potential issue | 🟠 Major

Don't let ClickHouse resolution bypass the deferred error path.

getClickhouseForOrganization() is now awaited before typeddefer(). If org datastore lookup or client initialization fails, the loader throws before TypedAwait can render its errorElement, and the page shell won't stream until that lookup finishes. Fold the factory lookup into listPromise so it shares the existing deferred failure handling.

♻️ Suggested shape
-  const logsClickhouse = await clickhouseFactory.getClickhouseForOrganization(project.organizationId, "logs");
-  const presenter = new LogsListPresenter($replica, logsClickhouse);
-
-  const listPromise = presenter
-    .call(project.organizationId, environment.id, {
-      userId,
-      projectId: project.id,
-      tasks: tasks.length > 0 ? tasks : undefined,
-      runId,
-      search,
-      levels,
-      period,
-      from,
-      to,
-      defaultPeriod: "1h",
-      retentionLimitDays
-    })
+  const listPromise = clickhouseFactory
+    .getClickhouseForOrganization(project.organizationId, "logs")
+    .then((logsClickhouse) =>
+      new LogsListPresenter($replica, logsClickhouse).call(project.organizationId, environment.id, {
+        userId,
+        projectId: project.id,
+        tasks: tasks.length > 0 ? tasks : undefined,
+        runId,
+        search,
+        levels,
+        period,
+        from,
+        to,
+        defaultPeriod: "1h",
+        retentionLimitDays,
+      })
+    )
     .catch((error) => {
       if (error instanceof ServiceValidationError) {
         return { error: error.message };
       }
       throw error;
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs/route.tsx`
around lines 137 - 159, The ClickHouse client lookup is awaited up front which
can throw before the deferred error handling; instead, defer the factory call so
failures are handled by the same promise chain: build listPromise by calling
clickhouseFactory.getClickhouseForOrganization(...) inside the async chain used
to create the LogsListPresenter and call presenter.call (keep references to
clickhouseFactory.getClickhouseForOrganization, LogsListPresenter,
presenter.call, and listPromise) so any errors from the org datastore/client
init are caught by the existing .catch that handles ServiceValidationError and
rethrows other errors, rather than throwing before TypedAwait renders its
errorElement.
apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts (1)

13-17: ⚠️ Potential issue | 🟠 Major

Don't silently drop the connection-tuning fields.

Lines 13-17 still require keepAliveEnabled, keepAliveIdleSocketTtl, and maxOpenConnections, but Lines 82-119 no longer pass them anywhere. This endpoint will accept those values and return success even though they no longer affect replication behavior. Either wire them into the new factory path or remove them from the request schema before merge.

One fix if these settings are no longer supported
 const CreateRunReplicationServiceParams = z.object({
   name: z.string(),
-  keepAliveEnabled: z.boolean(),
-  keepAliveIdleSocketTtl: z.number(),
-  maxOpenConnections: z.number(),
   maxFlushConcurrency: z.number(),
   flushIntervalMs: z.number(),
   flushBatchSize: z.number(),

Also applies to: 82-93, 95-119

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts` around lines
13 - 17, The request schema CreateRunReplicationServiceParams still requires
keepAliveEnabled, keepAliveIdleSocketTtl, and maxOpenConnections but the handler
that constructs the replication service (the new factory path used in the run
replication creation flow) no longer consumes these fields, so they are
effectively ignored; either (A) thread these three fields into the replication
service factory/constructor/creation call (the function that builds the
replication client/service used in the handler) so the values are applied,
making sure to map the names exactly, or (B) remove these fields from
CreateRunReplicationServiceParams (and any validation/usage sites) so the API no
longer accepts unused settings; locate the schema
CreateRunReplicationServiceParams and the replication service
factory/constructor function referenced in the create run replication handler to
implement one of these fixes.
🧹 Nitpick comments (4)
apps/webapp/test/runsReplicationService.part2.test.ts (1)

26-27: Cover org-to-ClickHouse routing with a multi-store case.

All of these setups hand the service a factory backed by a single ClickHouse instance. That means the suite still passes if RunsReplicationService sends the wrong organization ID into the factory, which is the new behavior this change depends on. Add one case that maps two orgs to different clients and asserts each org's rows land in the correct backend.

Also applies to: 46-47, 155-156, 270-271, 391-392, 525-526, 632-633, 801-802, 926-927, 1139-1140

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/test/runsReplicationService.part2.test.ts` around lines 26 - 27,
The tests currently instantiate RunsReplicationService with a
TestReplicationClickhouseFactory backed by a single ClickHouse instance, so they
don't catch org-to-ClickHouse routing bugs; update the affected test cases
(instances where new RunsReplicationService is constructed with new
TestReplicationClickhouseFactory(clickhouse)) to create a multi-store scenario:
build two distinct Test ClickHouse clients (e.g., clickhouseA and clickhouseB),
construct a TestReplicationClickhouseFactory that maps orgA -> clickhouseA and
orgB -> clickhouseB, pass that factory into the RunsReplicationService, then
insert rows for both orgs and assert each org's rows appear in the corresponding
backend client (use the same helper assertions already used elsewhere in the
tests). Ensure you change all occurrences referenced (the other construction
sites of TestReplicationClickhouseFactory) so the suite validates org routing
across multiple stores.
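The multi-store routing check suggested above could look something like this; TestReplicationClickhouseFactory's real interface isn't shown in this excerpt, so the factory, client, and replication shapes below are simplified stand-ins.

```typescript
// Stand-in fake client: records inserted rows so a test can assert routing.
interface FakeClickhouse {
  rows: Array<{ orgId: string; runId: string }>;
  insert(row: { orgId: string; runId: string }): void;
}

function makeFakeClickhouse(): FakeClickhouse {
  const rows: FakeClickhouse["rows"] = [];
  return { rows, insert: (row) => rows.push(row) };
}

// A factory mapping each org to its own backend, the scenario the review asks to cover.
class MultiStoreFactory {
  constructor(
    private byOrg: Map<string, FakeClickhouse>,
    private fallback: FakeClickhouse
  ) {}

  getForOrganization(orgId: string): FakeClickhouse {
    return this.byOrg.get(orgId) ?? this.fallback;
  }
}

// Simulates the replication service routing each row through the factory by org id.
function replicate(factory: MultiStoreFactory, rows: Array<{ orgId: string; runId: string }>) {
  for (const row of rows) factory.getForOrganization(row.orgId).insert(row);
}
```

A test built on this shape inserts rows for two orgs and asserts each backend received only its own org's rows, which is exactly the failure mode a single-store setup cannot detect.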
apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts (1)

94-130: Validate and type config at this service boundary.

These methods accept config: any and persist it directly. Any caller that skips route-level validation can store a payload that later makes loadFromDatabase() skip the datastore entirely. Parse config here with the schema for kind, and replace any with the inferred type.

As per coding guidelines, "Use zod for validation in packages/core and apps/webapp."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`
around lines 94 - 130, addDataStore and updateDataStore accept config: any and
persist it directly; validate and type the config at this service boundary using
the zod schema for the DataStoreKind before storing secrets or writing to
Prisma. Locate the methods addDataStore and updateDataStore and replace the
loose any type with the inferred TypeScript type from the zod parser; run the
appropriate zod.parse/partial-parse for the provided kind (e.g.,
schemaForKind(kind)) and throw or return a clear error on parse failure, then
use the validated config when calling getSecretStore().setSecret(secretKey, ...)
and when writing config to this._prisma.organizationDataStore.create/update;
ensure loadFromDatabase behavior is preserved by storing only the validated
shape (or a secured secretKey) and include error handling to surface validation
failures.
apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts (2)

336-389: Consider reducing duplication between clickhouse and clickhouse_v2 cases.

The two cases share ~95% identical configuration. Extracting common options would improve maintainability.

♻️ Proposed refactor
 function buildEventRepository(store: string, clickhouse: ClickHouse): ClickhouseEventRepository {
+  const commonOptions = {
+    clickhouse,
+    batchSize: env.EVENTS_CLICKHOUSE_BATCH_SIZE,
+    flushInterval: env.EVENTS_CLICKHOUSE_FLUSH_INTERVAL_MS,
+    maximumTraceSummaryViewCount: env.EVENTS_CLICKHOUSE_MAX_TRACE_SUMMARY_VIEW_COUNT,
+    maximumTraceDetailedSummaryViewCount: env.EVENTS_CLICKHOUSE_MAX_TRACE_DETAILED_SUMMARY_VIEW_COUNT,
+    maximumLiveReloadingSetting: env.EVENTS_CLICKHOUSE_MAX_LIVE_RELOADING_SETTING,
+    insertStrategy: env.EVENTS_CLICKHOUSE_INSERT_STRATEGY,
+    waitForAsyncInsert: env.EVENTS_CLICKHOUSE_WAIT_FOR_ASYNC_INSERT === "1",
+    asyncInsertMaxDataSize: env.EVENTS_CLICKHOUSE_ASYNC_INSERT_MAX_DATA_SIZE,
+    asyncInsertBusyTimeoutMs: env.EVENTS_CLICKHOUSE_ASYNC_INSERT_BUSY_TIMEOUT_MS,
+    llmMetricsBatchSize: env.LLM_METRICS_BATCH_SIZE,
+    llmMetricsFlushInterval: env.LLM_METRICS_FLUSH_INTERVAL_MS,
+    llmMetricsMaxBatchSize: env.LLM_METRICS_MAX_BATCH_SIZE,
+    llmMetricsMaxConcurrency: env.LLM_METRICS_MAX_CONCURRENCY,
+    otlpMetricsBatchSize: env.METRICS_CLICKHOUSE_BATCH_SIZE,
+    otlpMetricsFlushInterval: env.METRICS_CLICKHOUSE_FLUSH_INTERVAL_MS,
+    otlpMetricsMaxConcurrency: env.METRICS_CLICKHOUSE_MAX_CONCURRENCY,
+  };
+
   switch (store) {
     case "clickhouse": {
       return new ClickhouseEventRepository({
-        clickhouse,
-        batchSize: env.EVENTS_CLICKHOUSE_BATCH_SIZE,
-        // ... all the options
+        ...commonOptions,
+        startTimeMaxAgeMs: env.EVENTS_CLICKHOUSE_START_TIME_MAX_AGE_MS,
         version: "v1",
       });
     }
     case "clickhouse_v2": {
       return new ClickhouseEventRepository({
-        clickhouse: clickhouse,
-        batchSize: env.EVENTS_CLICKHOUSE_BATCH_SIZE,
-        // ... all the options
+        ...commonOptions,
         version: "v2",
       });
     }
     default: {
       throw new Error(`Unknown ClickHouse event repository store: ${store}`);
     }
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts` around lines
336 - 389, The switch in buildEventRepository duplicates nearly identical
ClickhouseEventRepository construction for "clickhouse" and "clickhouse_v2";
refactor by extracting the shared options into a single commonOptions object
(populate fields like clickhouse, batchSize, flushInterval,
maximumTraceSummaryViewCount, maximumTraceDetailedSummaryViewCount,
maximumLiveReloadingSetting, insertStrategy, waitForAsyncInsert,
asyncInsertMaxDataSize, asyncInsertBusyTimeoutMs, startTimeMaxAgeMs (only
present for v1), llmMetrics*, otlpMetrics*, etc.), then call new
ClickhouseEventRepository({...commonOptions, version: "v1"}) for "clickhouse"
and new ClickhouseEventRepository({...commonOptions, version: "v2"}) for
"clickhouse_v2"); ensure you preserve startTimeMaxAgeMs presence only where
required and keep explicit clickhouse property name to match the constructor.

164-179: Org-scoped clients use uniform configuration regardless of clientType.

Unlike the default clients where logs and replication have specialized settings (e.g., clickhouseSettings for logs, RUN_REPLICATION_* env vars for replication), all org-scoped clients use the same generic CLICKHOUSE_* configuration. If org-specific logs or replication clients require different settings (like max_memory_usage for logs queries), this could cause performance or resource issues.

Consider whether org-scoped clients need type-specific configurations, or document that org data stores intentionally use simplified settings.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts` around lines
164 - 179, buildOrgClickhouseClient currently creates org-scoped ClickHouse
clients with the same generic CLICKHOUSE_* config regardless of ClientType;
update it to apply type-specific settings (e.g., when clientType === "logs"
include clickhouseSettings like max_memory_usage or when clientType ===
"replication" honor RUN_REPLICATION_* env vars) or add a clear comment/docstring
stating that org clients intentionally use simplified settings. Locate
buildOrgClickhouseClient and branch on the ClientType argument to merge the same
specialized options used for default clients (name already contains clientType)
or document the design decision so callers know org clients lack type-specific
tuning.
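One way to give org-scoped clients type-specific settings, as this comment suggests, is a small branch on the client type when building options; the option shape and the setting values below (max_memory_usage, async_insert) are illustrative assumptions, not values from the PR.

```typescript
// Hypothetical option shape; real options would come from the ClickHouse client library.
type ClientType = "query" | "logs" | "replication";

interface OrgClientOptions {
  url: string;
  name: string;
  settings: Record<string, number | string>;
}

// Merge generic org config with per-type tuning so logs/replication clients
// are not stuck with one-size-fits-all defaults (values are illustrative).
function buildOrgClientOptions(orgId: string, clientType: ClientType, url: string): OrgClientOptions {
  const base: OrgClientOptions = {
    url,
    name: `clickhouse-${clientType}-${orgId}`,
    settings: {},
  };
  switch (clientType) {
    case "logs":
      return { ...base, settings: { max_memory_usage: 2_000_000_000 } };
    case "replication":
      return { ...base, settings: { async_insert: 1, wait_for_async_insert: 0 } };
    default:
      return base;
  }
}
```

The alternative the comment allows, documenting that org clients intentionally use simplified settings, avoids this branch at the cost of diverging behavior between default and org-scoped stores.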
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/webapp/app/env.server.ts`:
- Around line 1304-1305: The env schema allows
ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS to be zero or negative, enabling
tight-loop reloads; update the zod schema for
ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS to enforce sensible bounds (e.g.,
.int().min(60000) to require at least 1 minute and optionally .max(86400000) to
cap at 24 hours) while keeping the existing default (5 * 60 * 1000); use
z.coerce.number().int().min(...).max(...).default(...) or a .refine validator if
you prefer custom logic so invalid env values fail validation.

In
`@apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx`:
- Around line 248-253: The presenter is being given the same ClickHouse "query"
client twice; call clickhouseFactory.getClickhouseForOrganization a second time
with the logs-scoped role (e.g., "logs") to obtain a logs client and pass that
as the presenter’s third argument so ErrorGroupPresenter($replica, queryClient,
logsClient) receives a query-scoped client and a logs-scoped client respectively
(identify the factory call clickhouseFactory.getClickhouseForOrganization and
the constructor ErrorGroupPresenter to locate where to change).

In
`@apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.$modelId/route.tsx`:
- Around line 71-72: The ModelRegistryPresenter is being instantiated with only
ClickHouse but it needs the Prisma replica client for methods like
getModelDetail(); update the instantiation where you call new
ModelRegistryPresenter(clickhouse) to pass the Prisma replica (named $replica)
as the second argument, ensure you import $replica from ~/db.server if not
already imported, and verify presenter methods (e.g., getModelDetail,
getUserMetrics) now use the provided clients.

In
`@apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.test.tasks.$taskParam/route.tsx`:
- Around line 105-106: Move the call to
clickhouseFactory.getClickhouseForOrganization(...) so it executes inside the
loader's existing try block before constructing the TestTaskPresenter;
specifically ensure the clickhouse lookup and assignment to clickhouse happen
inside the try that wraps presenter.call(), then instantiate presenter = new
TestTaskPresenter($replica, clickhouse) after the successful lookup so any
errors from getClickhouseForOrganization are caught, logged, and handled by the
existing redirect/path in the catch.

In `@apps/webapp/app/routes/admin.data-stores.tsx`:
- Line 31: Update the import of tryCatch in the admin.data-stores route to use
the package subpath export: replace the current root import of tryCatch (import
{ tryCatch } from "@trigger.dev/core") with the subpath import from
"@trigger.dev/core/utils" so the symbol tryCatch is imported via the utils
subpath.
- Around line 88-137: ClickhouseConnectionSchema.parse() is used directly in the
"add" and "update" action handlers and will throw on invalid ClickHouse URLs
before the surrounding tryCatch runs; replace both usages with
ClickhouseConnectionSchema.safeParse(connectionObject) and check the returned
.success flag, returning typedjson({ error: validation.error.message }, {
status: 400 }) (or similar) when validation fails, then pass the validated value
to organizationDataStoresRegistry.addDataStore / updateDataStore; locate the
parse call sites in the "add" branch (where ClickhouseConnectionSchema.parse is
called before calling organizationDataStoresRegistry.addDataStore) and the
"update" branch (conditional parse before
organizationDataStoresRegistry.updateDataStore) and apply the safeParse pattern
and explicit error handling there.
- Line 1: The component currently calls onOpenChange(false) during render when
an operation reports success (the direct call to onOpenChange in the
render/JSX); move this side-effect into a useEffect so it runs after render
instead: remove the onOpenChange(false) call from the render/JSX, add a
useEffect that watches the isSuccess (or equivalent success flag) and calls
onOpenChange(false) when it becomes true, and include onOpenChange and the
success flag in the effect dependency array to avoid stale closures. Ensure you
reference the same success state variable and the onOpenChange prop used in the
component (replace the inline call with the new effect).

In `@apps/webapp/app/routes/otel.v1.metrics.ts`:
- Line 10: The code eagerly awaits otlpExporter (const exporter = await
otlpExporter) before validating the request content-type, causing unnecessary
exporter startup and possible init failures for requests that should return 400;
move the await of otlpExporter so it only runs after the content-type check (the
branch that returns the unsupported-content-type response) so malformed/probe
traffic never triggers exporter initialization — locate the otlpExporter await
in the handler in otel.v1.metrics.ts and only resolve otlpExporter after the
content-type validation code path.

In `@apps/webapp/app/runEngine/services/triggerFailedTask.server.ts`:
- Around line 70-74: The current write selects repository/store via
getEventRepository but only records taskEventStore in the run model, which will
break reads if an org is re-pointed; update the write path in
triggerFailedTask.server.ts so that when calling getEventRepository you also
persist the concrete OrganizationDataStore identifier (e.g., the returned
repository/store metadata or an OrganizationDataStore enum/value) into the run
model alongside taskEventStore (or alternatively mark the org's datastore
assignment immutable until a migration flag is set); specifically, modify the
code that assigns repository/store and the run creation/update logic to save the
concrete datastore binding (referencing getEventRepository, repository, store,
taskEventStore, and OrganizationDataStore) so future reads resolve the original
backend used for those writes.

In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts`:
- Around line 313-334: getEventsClickhouseClient currently instantiates a new
ClickHouse on each call; change it to use the existing singleton pattern: create
a singleton holder (e.g., const defaultEventsClickhouseClient = singleton(() =>
new ClickHouse({...same options...}))) and have getEventsClickhouseClient return
defaultEventsClickhouseClient() (or inline return singleton(() => new
ClickHouse(...))()). Keep the same URL parsing and options
(url.searchParams.delete("secure"), keepAlive, compression, etc.) but ensure
only one ClickHouse instance is created and reused across calls.
- Around line 362-384: The clickhouse_v2 case building the
ClickhouseEventRepository is missing the startTimeMaxAgeMs config that the
clickhouse case passes; update the clickhouse_v2 branch where
ClickhouseEventRepository is constructed to include startTimeMaxAgeMs:
env.EVENTS_CLICKHOUSE_START_TIME_MAX_AGE_MS so the same start-time
clamping/validation used by the v1 config is applied to v2 as well.

In
`@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`:
- Around line 72-75: The loop that builds the lookup map silently overwrites
keys produced from `${orgId}:${row.kind}`, causing nondeterministic behavior
when an org is assigned to multiple stores; before calling lookup.set(key,
parsed) check lookup.has(key) and if present throw a clear error (or assert)
that includes the conflicting orgId, kind, and identifiers for both the existing
entry and the current row (e.g., existingParsed/id and parsed/id) so overlapping
assignments discovered during findMany() fail fast and surface which rows/IDs
conflict.
- Around line 94-157: The addDataStore, updateDataStore and deleteDataStore
methods only write to the DB but don't update the in-memory registry map
(_lookup), leaving routing stale; after each successful mutation (after the
secret write and after the prisma create/update/delete) call the registry's
method that repopulates the in-memory map used by get() (the same function the
background loader uses — e.g. this.reload() / this._reloadLookup() /
this._refreshLookup(), whichever exists) so _lookup is refreshed immediately and
subsequent get() calls see the new data.

In `@apps/webapp/app/services/queryService.server.ts`:
- Around line 278-279: The concurrency limiter is currently keyed by projectId
but the ClickHouse client and execution are organization-scoped (see
clickhouseFactory.getClickhouseForOrganization and executeTSQL), so update the
limiter acquire/release calls (where limiter.acquire and limiter.release are
used around query execution) to use organizationId instead of projectId so
concurrent queries are limited per organization rather than per project; ensure
any helper that constructs limiter keys or caches limiter instances uses
organizationId consistently in the same places referenced around the query
execution.

In `@apps/webapp/app/services/runsReplicationService.server.ts`:
- Around line 655-685: The current loop serializes inserts per ClickHouse group
because each await blocks the next; change it to run per-group flushes
concurrently with a bounded concurrency limiter (e.g., p-limit or an internal
semaphore) so slow/unhealthy ClickHouse targets don't stall unrelated orgs. For
each group (the items handled around sortTaskRunInserts, sortPayloadInserts,
combinedTaskRunInserts/combinedPayloadInserts and using this.#insertWithRetry
which calls this.#insertTaskRunInserts and this.#insertPayloadInserts with
flushId), create an async task that performs the two insertWithRetry calls,
records the first taskRunError/payloadError seen, and updates
_taskRunsInsertedCounter/_payloadsInsertedCounter when each insert succeeds;
schedule those tasks through the limiter and await Promise.all on the scheduled
tasks; ensure you keep pushing group.taskRunInserts and group.payloadInserts
into combined* before scheduling so semantics remain the same.

In `@apps/webapp/app/v3/otlpExporter.server.ts`:
- Around line 1207-1215: The initializer initializeOTLPExporter is reading
settings directly from process.env; change it to use the typed env export
instead: replace process.env.OTLP_EXPORTER_VERBOSE with
env.OTLP_EXPORTER_VERBOSE (coerce to "1" check as before) and replace
process.env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT with
env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT and parseInt that value,
falling back to 8192; ensure the existing env import is used (or add it if
missing) and keep clickhouseFactory and OTLPExporter usage unchanged.
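The bounded-concurrency flush described in the runsReplicationService comment can be sketched with a tiny promise semaphore (a hand-rolled stand-in for the p-limit package); the group shape and names are illustrative.

```typescript
// Minimal promise semaphore, a stand-in for the p-limit package.
function createLimiter(max: number) {
  let active = 0;
  const queue: Array<() => void> = [];

  const next = () => {
    if (active < max && queue.length > 0) {
      active++;
      queue.shift()!();
    }
  };

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    // Wait for a free slot before starting the task.
    await new Promise<void>((resolve) => {
      queue.push(resolve);
      next();
    });
    try {
      return await task();
    } finally {
      active--;
      next(); // release the slot and start the next queued task
    }
  };
}

// Flush each ClickHouse group concurrently (bounded), so one slow or
// unhealthy target cannot stall unrelated organizations' inserts.
async function flushGroups(
  groups: Array<{ name: string; flush: () => Promise<void> }>,
  maxConcurrency: number
): Promise<void> {
  const limit = createLimiter(maxConcurrency);
  await Promise.all(groups.map((g) => limit(() => g.flush())));
}
```

Per the review comment, each group's task would wrap the two insertWithRetry calls and record the first error seen, with Promise.all surfacing failures after all groups have been attempted.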

---

Outside diff comments:
In
`@apps/webapp/app/routes/_app.orgs`.$organizationSlug.projects.$projectParam.env.$envParam.logs/route.tsx:
- Around line 137-159: The ClickHouse client lookup is awaited up front which
can throw before the deferred error handling; instead, defer the factory call so
failures are handled by the same promise chain: build listPromise by calling
clickhouseFactory.getClickhouseForOrganization(...) inside the async chain used
to create the LogsListPresenter and call presenter.call (keep references to
clickhouseFactory.getClickhouseForOrganization, LogsListPresenter,
presenter.call, and listPromise) so any errors from the org datastore/client
init are caught by the existing .catch that handles ServiceValidationError and
rethrows other errors, rather than throwing before TypedAwait renders its
errorElement.

In `@apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts`:
- Around line 13-17: The request schema CreateRunReplicationServiceParams still
requires keepAliveEnabled, keepAliveIdleSocketTtl, and maxOpenConnections but
the handler that constructs the replication service (the new factory path used
in the run replication creation flow) no longer consumes these fields, so they
are effectively ignored; either (A) thread these three fields into the
replication service factory/constructor/creation call (the function that builds
the replication client/service used in the handler) so the values are applied,
making sure to map the names exactly, or (B) remove these fields from
CreateRunReplicationServiceParams (and any validation/usage sites) so the API no
longer accepts unused settings; locate the schema
CreateRunReplicationServiceParams and the replication service
factory/constructor function referenced in the create run replication handler to
implement one of these fixes.

---

Nitpick comments:
In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts`:
- Around line 336-389: The switch in buildEventRepository duplicates nearly
identical ClickhouseEventRepository construction for "clickhouse" and
"clickhouse_v2"; refactor by extracting the shared options into a single
commonOptions object (populate fields like clickhouse, batchSize, flushInterval,
maximumTraceSummaryViewCount, maximumTraceDetailedSummaryViewCount,
maximumLiveReloadingSetting, insertStrategy, waitForAsyncInsert,
asyncInsertMaxDataSize, asyncInsertBusyTimeoutMs, startTimeMaxAgeMs (only
present for v1), llmMetrics*, otlpMetrics*, etc.), then call new
ClickhouseEventRepository({...commonOptions, version: "v1"}) for "clickhouse"
and new ClickhouseEventRepository({...commonOptions, version: "v2"}) for
"clickhouse_v2"); ensure you preserve startTimeMaxAgeMs presence only where
required and keep explicit clickhouse property name to match the constructor.
- Around line 164-179: buildOrgClickhouseClient currently creates org-scoped
ClickHouse clients with the same generic CLICKHOUSE_* config regardless of
ClientType; update it to apply type-specific settings (e.g., when clientType ===
"logs" include clickhouseSettings like max_memory_usage or when clientType ===
"replication" honor RUN_REPLICATION_* env vars) or add a clear comment/docstring
stating that org clients intentionally use simplified settings. Locate
buildOrgClickhouseClient and branch on the ClientType argument to merge the same
specialized options used for default clients (name already contains clientType)
or document the design decision so callers know org clients lack type-specific
tuning.

In
`@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`:
- Around line 94-130: addDataStore and updateDataStore accept config: any and
persist it directly; validate and type the config at this service boundary using
the zod schema for the DataStoreKind before storing secrets or writing to
Prisma. Locate the methods addDataStore and updateDataStore and replace the
loose any type with the inferred TypeScript type from the zod parser; run the
appropriate zod.parse/partial-parse for the provided kind (e.g.,
schemaForKind(kind)) and throw or return a clear error on parse failure, then
use the validated config when calling getSecretStore().setSecret(secretKey, ...)
and when writing config to this._prisma.organizationDataStore.create/update;
ensure loadFromDatabase behavior is preserved by storing only the validated
shape (or a secured secretKey) and include error handling to surface validation
failures.
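The boundary validation this prompt asks for can be sketched as below. The real code would run the zod schema for the DataStoreKind (e.g. schemaForKind(kind).parse(config)) and use z.infer for the type; a dependency-free guard stands in for zod here so the example is self-contained, and the config field names are assumptions.

```typescript
// Hypothetical sketch: validate a data-store config at the service
// boundary before any secret is stored or any row is written.
type DataStoreKind = "CLICKHOUSE";

interface ClickhouseStoreConfig {
  url: string;
  username: string;
  password: string;
}

function parseClickhouseConfig(input: unknown): ClickhouseStoreConfig {
  if (typeof input !== "object" || input === null) {
    throw new Error("config must be an object");
  }
  const record = input as Record<string, unknown>;
  for (const field of ["url", "username", "password"] as const) {
    if (typeof record[field] !== "string" || record[field] === "") {
      throw new Error(`config.${field} must be a non-empty string`);
    }
  }
  return {
    url: record.url as string,
    username: record.username as string,
    password: record.password as string,
  };
}

function validateConfigForKind(kind: DataStoreKind, config: unknown): ClickhouseStoreConfig {
  switch (kind) {
    case "CLICKHOUSE":
      return parseClickhouseConfig(config);
    default:
      throw new Error(`unsupported data store kind: ${kind}`);
  }
}
```

Failing loudly here means setSecret and the Prisma write only ever see a validated shape, so loadFromDatabase can trust what it reads back.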

In `@apps/webapp/test/runsReplicationService.part2.test.ts`:
- Around line 26-27: The tests currently instantiate RunsReplicationService with
a TestReplicationClickhouseFactory backed by a single ClickHouse instance, so
they don't catch org-to-ClickHouse routing bugs; update the affected test cases
(instances where new RunsReplicationService is constructed with new
TestReplicationClickhouseFactory(clickhouse)) to create a multi-store scenario:
build two distinct Test ClickHouse clients (e.g., clickhouseA and clickhouseB),
construct a TestReplicationClickhouseFactory that maps orgA -> clickhouseA and
orgB -> clickhouseB, pass that factory into the RunsReplicationService, then
insert rows for both orgs and assert each org's rows appear in the corresponding
backend client (use the same helper assertions already used elsewhere in the
tests). Ensure you change all occurrences referenced (the other construction
sites of TestReplicationClickhouseFactory) so the suite validates org routing
across multiple stores.
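The multi-store routing check described above can be sketched with fake clients. The names (FakeReplicationClickhouseFactory, the client shape, replicateRow) are simplified stand-ins for the real test helpers, and the lookup is synchronous here for brevity where the real factory is async.

```typescript
// Fake ClickHouse backends and a factory that routes by organization id,
// mirroring the suggested test shape: orgA -> clickhouseA, orgB -> clickhouseB.
interface FakeClickhouse {
  name: string;
  rows: Array<{ organizationId: string; runId: string }>;
}

class FakeReplicationClickhouseFactory {
  constructor(
    private readonly byOrg: Map<string, FakeClickhouse>,
    private readonly fallback: FakeClickhouse
  ) {}

  getClickhouseForOrganization(organizationId: string): FakeClickhouse {
    return this.byOrg.get(organizationId) ?? this.fallback;
  }
}

function replicateRow(
  factory: FakeReplicationClickhouseFactory,
  row: { organizationId: string; runId: string }
) {
  // The real service batches inserts; routing is the part under test here.
  factory.getClickhouseForOrganization(row.organizationId).rows.push(row);
}
```

Asserting that each org's rows land only in its own backend is exactly the bug class a single shared test client cannot catch.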

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 3a31161c-3ac9-4af9-bcd2-f229a19f7a61

📥 Commits

Reviewing files that changed from the base of the PR and between def21b2 and a60f2cb.

📒 Files selected for processing (69)
  • .cursor/mcp.json
  • .server-changes/organization-scoped-clickhouse.md
  • CLAUDE.md
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/ApiRunListPresenter.server.ts
  • apps/webapp/app/presenters/v3/CreateBulkActionPresenter.server.ts
  • apps/webapp/app/presenters/v3/RunPresenter.server.ts
  • apps/webapp/app/presenters/v3/RunTagListPresenter.server.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/TaskListPresenter.server.ts
  • apps/webapp/app/presenters/v3/UsagePresenter.server.ts
  • apps/webapp/app/presenters/v3/ViewSchedulePresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.dashboards.$dashboardKey/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.$modelId/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.compare/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts.$promptSlug/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.test.tasks.$taskParam/route.tsx
  • apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts
  • apps/webapp/app/routes/admin.api.v1.runs-replication.start.ts
  • apps/webapp/app/routes/admin.data-stores.tsx
  • apps/webapp/app/routes/admin.tsx
  • apps/webapp/app/routes/api.v1.prompts.$slug.ts
  • apps/webapp/app/routes/api.v1.prompts.$slug.versions.ts
  • apps/webapp/app/routes/api.v1.prompts._index.ts
  • apps/webapp/app/routes/otel.v1.logs.ts
  • apps/webapp/app/routes/otel.v1.metrics.ts
  • apps/webapp/app/routes/otel.v1.traces.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.$logId.tsx
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts.$promptSlug.generations.ts
  • apps/webapp/app/runEngine/concerns/traceEvents.server.ts
  • apps/webapp/app/runEngine/services/triggerFailedTask.server.ts
  • apps/webapp/app/services/admin/missingLlmModels.server.ts
  • apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts
  • apps/webapp/app/services/clickhouse/clickhouseSecretSchemas.server.ts
  • apps/webapp/app/services/clickhouseInstance.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoreConfigSchemas.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistryInstance.server.ts
  • apps/webapp/app/services/queryService.server.ts
  • apps/webapp/app/services/runsReplicationInstance.server.ts
  • apps/webapp/app/services/runsReplicationService.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts
  • apps/webapp/app/v3/eventRepository/eventRepository.server.ts
  • apps/webapp/app/v3/eventRepository/eventRepository.types.ts
  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/v3/otlpExporter.server.ts
  • apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts
  • apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
  • apps/webapp/test/clickhouseFactory.test.ts
  • apps/webapp/test/organizationDataStoresRegistry.test.ts
  • apps/webapp/test/runsBackfiller.test.ts
  • apps/webapp/test/runsReplicationBenchmark.test.ts
  • apps/webapp/test/runsReplicationService.part1.test.ts
  • apps/webapp/test/runsReplicationService.part2.test.ts
  • apps/webapp/test/utils/replicationUtils.ts
  • apps/webapp/test/utils/testReplicationClickhouseFactory.ts
  • apps/webapp/test/utils/tracing.ts
  • internal-packages/database/prisma/migrations/20260331212308_add_organization_data_stores/migration.sql
  • internal-packages/database/prisma/schema.prisma
💤 Files with no reviewable changes (2)
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts
  • apps/webapp/app/services/clickhouseInstance.server.ts

Comment thread apps/webapp/app/env.server.ts Outdated
Comment on lines +1304 to +1305
// Organization data stores registry
ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS: z.coerce.number().int().default(5 * 60 * 1000), // 5 minutes

⚠️ Potential issue | 🟠 Major

Validate reload interval bounds to prevent tight-loop reloads.

Line 1305 currently allows 0 or negative values. A bad env value can make the registry reload loop run too frequently and degrade service stability.

Suggested fix
-    ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS: z.coerce.number().int().default(5 * 60 * 1000), // 5 minutes
+    ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS: z.coerce
+      .number()
+      .int()
+      .min(1_000)
+      .default(5 * 60 * 1000), // 5 minutes
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/env.server.ts` around lines 1304 - 1305, The env schema
allows ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS to be zero or negative,
enabling tight-loop reloads; update the zod schema for
ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS to enforce sensible bounds (e.g.,
.int().min(60000) to require at least 1 minute and optionally .max(86400000) to
cap at 24 hours) while keeping the existing default (5 * 60 * 1000); use
z.coerce.number().int().min(...).max(...).default(...) or a .refine validator if
you prefer custom logic so invalid env values fail validation.

Comment on lines +248 to +253
const clickhouseClient = await clickhouseFactory.getClickhouseForOrganization(
environment.organizationId,
"query"
);

const presenter = new ErrorGroupPresenter($replica, clickhouseClient, clickhouseClient);

⚠️ Potential issue | 🟠 Major

Pass a logs-scoped client into the presenter's second ClickHouse slot.

Line 253 currently injects the "query" client twice. That keeps the error-side queries working, but any org that routes logs to a separate datastore will still read logs from the wrong backend on this page.

Suggested fix
-  const clickhouseClient = await clickhouseFactory.getClickhouseForOrganization(
-    environment.organizationId,
-    "query"
-  );
-
-  const presenter = new ErrorGroupPresenter($replica, clickhouseClient, clickhouseClient);
+  const [queryClickhouseClient, logsClickhouseClient] = await Promise.all([
+    clickhouseFactory.getClickhouseForOrganization(environment.organizationId, "query"),
+    clickhouseFactory.getClickhouseForOrganization(environment.organizationId, "logs"),
+  ]);
+
+  const presenter = new ErrorGroupPresenter(
+    $replica,
+    queryClickhouseClient,
+    logsClickhouseClient
+  );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/webapp/app/routes/_app.orgs`.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
around lines 248 - 253, The presenter is being given the same ClickHouse "query"
client twice; call clickhouseFactory.getClickhouseForOrganization a second time
with the logs-scoped role (e.g., "logs") to obtain a logs client and pass that
as the presenter’s third argument so ErrorGroupPresenter($replica, queryClient,
logsClient) receives a query-scoped client and a logs-scoped client respectively
(identify the factory call clickhouseFactory.getClickhouseForOrganization and
the constructor ErrorGroupPresenter to locate where to change).

Comment on lines +71 to +72
const clickhouse = await clickhouseFactory.getClickhouseForOrganization(project.organizationId, "standard");
const presenter = new ModelRegistryPresenter(clickhouse);

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

presenter=$(fd 'ModelRegistryPresenter.server.ts' | head -n1)
factory=$(fd 'clickhouseFactory.server.ts' | head -n1)

echo "== ModelRegistryPresenter queries =="
rg -n -C4 'getModelDetail|getUserMetrics|getModelComparison|llm_models|llm_metrics' "$presenter"

echo
echo "== ClickHouse factory resolution =="
rg -n -C4 'getClickhouseForOrganization|OrganizationDataStore|standard' "$factory"

Repository: triggerdotdev/trigger.dev

Length of output: 4487


🏁 Script executed:

fd 'ModelRegistryPresenter.server.ts' -x wc -l {}

Repository: triggerdotdev/trigger.dev

Length of output: 137


🏁 Script executed:

fd 'ModelRegistryPresenter.server.ts' -x head -n 50 {}

Repository: triggerdotdev/trigger.dev

Length of output: 1982


🏁 Script executed:

fd 'route.tsx' -path '*projects*$modelId*' -x cat -n {}

Repository: triggerdotdev/trigger.dev

Length of output: 237


🏁 Script executed:

rg -n -A15 'constructor\(' apps/webapp/app/presenters/v3/ModelRegistryPresenter.server.ts | head -n 30

Repository: triggerdotdev/trigger.dev

Length of output: 661


🏁 Script executed:

fd 'ModelRegistryPresenter' --type f -path '*$modelId*'

Repository: triggerdotdev/trigger.dev

Length of output: 237


🏁 Script executed:

rg -l '\$modelId' apps/webapp/app/routes | head -n 5

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

rg -l 'ModelRegistryPresenter' apps/webapp/app/routes --type ts

Repository: triggerdotdev/trigger.dev

Length of output: 407


🏁 Script executed:

rg -B5 -A10 'new ModelRegistryPresenter' apps/webapp/app/routes

Repository: triggerdotdev/trigger.dev

Length of output: 7330


🏁 Script executed:

rg -B5 'this\._replica' apps/webapp/app/presenters/v3/ModelRegistryPresenter.server.ts | head -n 20

Repository: triggerdotdev/trigger.dev

Length of output: 506


🏁 Script executed:

rg -B5 'class BasePresenter' apps/webapp/app/presenters/v3/basePresenter.server.ts | head -n 15

Repository: triggerdotdev/trigger.dev

Length of output: 386


🏁 Script executed:

rg -A20 'class BasePresenter' apps/webapp/app/presenters/v3/basePresenter.server.ts

Repository: triggerdotdev/trigger.dev

Length of output: 754


Pass the Prisma replica client when instantiating ModelRegistryPresenter.

The presenter requires both Prisma (for getModelDetail()) and ClickHouse (for getUserMetrics()). Currently instantiated with only the ClickHouse client, any call to getModelDetail() will crash when accessing this._replica.llmModel.findFirst(...).

Pass $replica as the second argument:

Diff
  const clickhouse = await clickhouseFactory.getClickhouseForOrganization(project.organizationId, "standard");
- const presenter = new ModelRegistryPresenter(clickhouse);
+ const presenter = new ModelRegistryPresenter(clickhouse, $replica);
  const model = await presenter.getModelDetail(modelId);

Import $replica from ~/db.server if not already present.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/webapp/app/routes/_app.orgs`.$organizationSlug.projects.$projectParam.env.$envParam.models.$modelId/route.tsx
around lines 71 - 72, The ModelRegistryPresenter is being instantiated with only
ClickHouse but it needs the Prisma replica client for methods like
getModelDetail(); update the instantiation where you call new
ModelRegistryPresenter(clickhouse) to pass the Prisma replica (named $replica)
as the second argument, ensure you import $replica from ~/db.server if not
already imported, and verify presenter methods (e.g., getModelDetail,
getUserMetrics) now use the provided clients.

Comment on lines +105 to +106
const clickhouse = await clickhouseFactory.getClickhouseForOrganization(project.organizationId, "standard");
const presenter = new TestTaskPresenter($replica, clickhouse);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Move the factory lookup back under the loader's try.

getClickhouseForOrganization() can now fail before presenter.call() runs. With it outside the try, those failures bypass the current logging + redirect path and surface as a 500 instead.

Proposed fix
-  const clickhouse = await clickhouseFactory.getClickhouseForOrganization(project.organizationId, "standard");
-  const presenter = new TestTaskPresenter($replica, clickhouse);
   try {
+    const clickhouse = await clickhouseFactory.getClickhouseForOrganization(
+      project.organizationId,
+      "standard"
+    );
+    const presenter = new TestTaskPresenter($replica, clickhouse);
     const [result, regionsResult] = await Promise.all([
       presenter.call({
         userId: user.id,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@apps/webapp/app/routes/_app.orgs`.$organizationSlug.projects.$projectParam.env.$envParam.test.tasks.$taskParam/route.tsx
around lines 105 - 106, Move the call to
clickhouseFactory.getClickhouseForOrganization(...) so it executes inside the
loader's existing try block before constructing the TestTaskPresenter;
specifically ensure the clickhouse lookup and assignment to clickhouse happen
inside the try that wraps presenter.call(), then instantiate presenter = new
TestTaskPresenter($replica, clickhouse) after the successful lookup so any
errors from getClickhouseForOrganization are caught, logged, and handled by the
existing redirect/path in the catch.

@@ -0,0 +1,400 @@
import { useState } from "react";

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cd apps/webapp && find . -name "admin.data-stores.tsx" -type f

Repository: triggerdotdev/trigger.dev

Length of output: 103


🏁 Script executed:

cat -n apps/webapp/app/routes/admin.data-stores.tsx

Repository: triggerdotdev/trigger.dev

Length of output: 16350


Move the close-on-success update out of render.

Calling onOpenChange(false) during render (lines 307-310) triggers a render-phase state update on the parent. React will warn about this, and it gets noisy in Strict Mode.

🔧 Suggested fix
-import { useState } from "react";
+import { useEffect, useState } from "react";
...
-  // Close dialog on success
-  if (fetcher.data?.success && open) {
-    onOpenChange(false);
-  }
+  // Close dialog on success
+  useEffect(() => {
+    if (fetcher.data?.success && open) {
+      onOpenChange(false);
+    }
+  }, [fetcher.data?.success, open, onOpenChange]);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.data-stores.tsx` at line 1, The component
currently calls onOpenChange(false) during render when an operation reports
success (the direct call to onOpenChange in the render/JSX); move this
side-effect into a useEffect so it runs after render instead: remove the
onOpenChange(false) call from the render/JSX, add a useEffect that watches the
isSuccess (or equivalent success flag) and calls onOpenChange(false) when it
becomes true, and include onOpenChange and the success flag in the effect
dependency array to avoid stale closures. Ensure you reference the same success
state variable and the onOpenChange prop used in the component (replace the
inline call with the new effect).

Comment on lines +72 to +75
for (const orgId of row.organizationIds) {
const key = `${orgId}:${row.kind}`;
lookup.set(key, parsed);
}

⚠️ Potential issue | 🟠 Major

Fail fast on overlapping org assignments.

lookup.set() silently overwrites any earlier ${orgId}:${row.kind} entry. Because findMany() is unordered here, an org that appears in two stores can end up with a nondeterministic effective route depending on row order.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`
around lines 72 - 75, The loop that builds the lookup map silently overwrites
keys produced from `${orgId}:${row.kind}`, causing nondeterministic behavior
when an org is assigned to multiple stores; before calling lookup.set(key,
parsed) check lookup.has(key) and if present throw a clear error (or assert)
that includes the conflicting orgId, kind, and identifiers for both the existing
entry and the current row (e.g., existingParsed/id and parsed/id) so overlapping
assignments discovered during findMany() fail fast and surface which rows/IDs
conflict.
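The fail-fast check the comment asks for can be sketched as below; the row shape is a simplified assumption, not the real Prisma model.

```typescript
// Minimal sketch: reject a second store claiming the same `${orgId}:${kind}`
// key instead of silently overwriting, so overlapping org assignments
// surface at load time rather than producing nondeterministic routing.
interface StoreRow {
  id: string;
  kind: string;
  organizationIds: string[];
}

function buildLookup(rows: StoreRow[]): Map<string, StoreRow> {
  const lookup = new Map<string, StoreRow>();
  for (const row of rows) {
    for (const orgId of row.organizationIds) {
      const key = `${orgId}:${row.kind}`;
      const existing = lookup.get(key);
      if (existing) {
        throw new Error(
          `Conflicting data stores for ${key}: ${existing.id} and ${row.id}`
        );
      }
      lookup.set(key, row);
    }
  }
  return lookup;
}
```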

Comment on lines +94 to +157
async addDataStore({
key,
kind,
organizationIds,
config,
}: {
key: string;
kind: DataStoreKind;
organizationIds: string[];
config: any;
}) {
const secretKey = this.#secretKey(key, kind);

const secretStore = getSecretStore("DATABASE", { prismaClient: this._prisma });
await secretStore.setSecret(secretKey, config);

return this._prisma.organizationDataStore.create({
data: {
key,
organizationIds,
kind: "CLICKHOUSE",
config: { version: 1, data: { secretKey } },
},
});
}

async updateDataStore({
key,
kind,
organizationIds,
config,
}: {
key: string;
kind: DataStoreKind;
organizationIds: string[];
config?: any;
}) {
const secretKey = this.#secretKey(key, kind);

if (config) {
const secretStore = getSecretStore("DATABASE", { prismaClient: this._prisma });
await secretStore.setSecret(secretKey, config);
}

return this._prisma.organizationDataStore.update({
where: {
key,
},
data: {
organizationIds,
kind: "CLICKHOUSE",
},
});
}

async deleteDataStore({ key, kind }: { key: string; kind: DataStoreKind }) {
const secretKey = this.#secretKey(key, kind);
const secretStore = getSecretStore("DATABASE", { prismaClient: this._prisma });
await secretStore.deleteSecret(secretKey).catch(() => {
// Secret may not exist — proceed with deletion
});

await this._prisma.organizationDataStore.delete({ where: { key } });
}

⚠️ Potential issue | 🟠 Major

Refresh _lookup after mutations.

get() only reads the in-memory map, but these mutation methods only write to the database. On the singleton registry that means routing stays stale until the next background reload, so OTLP/replication can keep targeting the old ClickHouse instance after an admin change.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`
around lines 94 - 157, The addDataStore, updateDataStore and deleteDataStore
methods only write to the DB but don't update the in-memory registry map
(_lookup), leaving routing stale; after each successful mutation (after the
secret write and after the prisma create/update/delete) call the registry's
method that repopulates the in-memory map used by get() (the same function the
background loader uses — e.g. this.reload() / this._reloadLookup() /
this._refreshLookup(), whichever exists) so _lookup is refreshed immediately and
subsequent get() calls see the new data.
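The reload-after-mutation pattern can be sketched in memory as below. The "database" is a plain array and the method names echo the registry but are assumptions; the point is that every write path re-runs the same load routine the background timer uses.

```typescript
// In-memory sketch: mutations refresh the lookup immediately so get()
// reflects admin changes without waiting for the next background reload.
interface StoreRecord {
  key: string;
  organizationIds: string[];
}

class RegistrySketch {
  private dbRows: StoreRecord[] = [];
  private lookup = new Map<string, StoreRecord>();

  // Same routine the periodic background reload would call.
  private reload() {
    const next = new Map<string, StoreRecord>();
    for (const row of this.dbRows) {
      for (const orgId of row.organizationIds) {
        next.set(orgId, row);
      }
    }
    this.lookup = next;
  }

  addDataStore(record: StoreRecord) {
    this.dbRows.push(record); // stands in for the Prisma create
    this.reload(); // refresh the in-memory map right away
  }

  deleteDataStore(key: string) {
    this.dbRows = this.dbRows.filter((r) => r.key !== key);
    this.reload();
  }

  get(orgId: string) {
    return this.lookup.get(orgId);
  }
}
```

Building a fresh map and swapping it in (rather than mutating the live one) also keeps concurrent readers from observing a half-updated lookup.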

Comment on lines +278 to +279
const queryClickhouse = await clickhouseFactory.getClickhouseForOrganization(organizationId, "query");
const result = await executeTSQL(queryClickhouse.reader, {

⚠️ Potential issue | 🟠 Major

Key the concurrency limiter by organizationId now that the backend is org-scoped.

Line 278 routes execution through an organization-scoped ClickHouse client, but the limiter in Lines 155-160 and 376-379 still acquires/releases by projectId. One organization with multiple projects can therefore exceed the advertised org limit and push more concurrent queries onto the same backend than intended.

⚠️ Proposed fix
   const acquireResult = await queryConcurrencyLimiter.acquire({
-    key: projectId,
+    key: organizationId,
     requestId,
     keyLimit: orgLimit,
     globalLimit: GLOBAL_CONCURRENCY_LIMIT,
   });

   if (!acquireResult.success) {
     const errorMessage =
       acquireResult.reason === "key_limit"
-        ? `You've exceeded your query concurrency of ${orgLimit} for this project. Please try again later.`
+        ? `You've exceeded your query concurrency of ${orgLimit} for this organization. Please try again later.`
         : "We're experiencing a lot of queries at the moment. Please try again later.";
     return { success: false, error: new QueryError(errorMessage, { query: options.query }) };
   }
@@
   } finally {
     // Always release the concurrency slot
     await queryConcurrencyLimiter.release({
-      key: projectId,
+      key: organizationId,
       requestId,
     });
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/queryService.server.ts` around lines 278 - 279, The
concurrency limiter is currently keyed by projectId but the ClickHouse client
and execution are organization-scoped (see
clickhouseFactory.getClickhouseForOrganization and executeTSQL), so update the
limiter acquire/release calls (where limiter.acquire and limiter.release are
used around query execution) to use organizationId instead of projectId so
concurrent queries are limited per organization rather than per project; ensure
any helper that constructs limiter keys or caches limiter instances uses
organizationId consistently in the same places referenced around the query
execution.

Comment on lines +655 to +685
for (const [clickhouse, group] of groups) {
sortTaskRunInserts(group.taskRunInserts);
sortPayloadInserts(group.payloadInserts);
combinedTaskRunInserts.push(...group.taskRunInserts);
combinedPayloadInserts.push(...group.payloadInserts);

const [trErr] = await this.#insertWithRetry(
(attempt) => this.#insertTaskRunInserts(clickhouse, group.taskRunInserts, attempt),
"task run inserts",
flushId
);
if (trErr && !taskRunError) {
taskRunError = trErr;
}

const [plErr] = await this.#insertWithRetry(
(attempt) => this.#insertPayloadInserts(clickhouse, group.payloadInserts, attempt),
"payload inserts",
flushId
);
if (plErr && !payloadError) {
payloadError = plErr;
}

if (!trErr) {
this._taskRunsInsertedCounter.add(group.taskRunInserts.length);
}
if (!plErr) {
this._payloadsInsertedCounter.add(group.payloadInserts.length);
}
}

⚠️ Potential issue | 🟠 Major

Avoid serializing the per-ClickHouse-group flushes.

Each await here makes every org group wait behind the slowest ClickHouse target. If one org-specific service is unhealthy and spends a few seconds in retry/backoff, unrelated orgs in the same replication batch inherit that lag too. On this hot path, the groups should be flushed behind a bounded limiter instead of one-by-one.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/runsReplicationService.server.ts` around lines 655 -
685, The current loop serializes inserts per ClickHouse group because each await
blocks the next; change it to run per-group flushes concurrently with a bounded
concurrency limiter (e.g., p-limit or an internal semaphore) so slow/unhealthy
ClickHouse targets don't stall unrelated orgs. For each group (the items handled
around sortTaskRunInserts, sortPayloadInserts,
combinedTaskRunInserts/combinedPayloadInserts and using this.#insertWithRetry
which calls this.#insertTaskRunInserts and this.#insertPayloadInserts with
flushId), create an async task that performs the two insertWithRetry calls,
records the first taskRunError/payloadError seen, and updates
_taskRunsInsertedCounter/_payloadsInsertedCounter when each insert succeeds;
schedule those tasks through the limiter and await Promise.all on the scheduled
tasks; ensure you keep pushing group.taskRunInserts and group.payloadInserts
into combined* before scheduling so semantics remain the same.
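The bounded limiter the comment suggests can be sketched without a dependency; mapWithConcurrency below is a hand-rolled stand-in for p-limit, so a slow ClickHouse target occupies only one of the available slots while other groups keep flushing.

```typescript
// Run fn over items with at most `limit` in flight at once, preserving
// result order. Single-threaded JS makes the shared `next` index safe:
// it is read and incremented synchronously between awaits.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await fn(items[index]);
    }
  }
  const workers = Array.from({ length: Math.min(limit, items.length) }, worker);
  await Promise.all(workers);
  return results;
}
```

In the flush path, each item would be one ClickHouse group performing its two insertWithRetry calls, with errors recorded per group rather than thrown, so one unhealthy backend neither stalls nor aborts the others.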

Comment on lines +1207 to +1215
async function initializeOTLPExporter() {
  await clickhouseFactory.isReady();
  return new OTLPExporter({
    clickhouseFactory,
    verbose: process.env.OTLP_EXPORTER_VERBOSE === "1",
    spanAttributeValueLengthLimit: process.env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT
      ? parseInt(process.env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT, 10)
      : 8192,
  });
}

⚠️ Potential issue | 🟡 Minor

Use env for these new OTLP exporter settings.

This initializer now bypasses the webapp's typed env surface even though env is already imported in the file. Please route both settings through env instead of process.env.

As per coding guidelines, "Environment variables must be accessed via the env export from app/env.server.ts and never use process.env directly."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/otlpExporter.server.ts` around lines 1207 - 1215, The
initializer initializeOTLPExporter is reading settings directly from
process.env; change it to use the typed env export instead: replace
process.env.OTLP_EXPORTER_VERBOSE with env.OTLP_EXPORTER_VERBOSE (coerce to "1"
check as before) and replace
process.env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT with
env.SERVER_OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT and parseInt that value,
falling back to 8192; ensure the existing env import is used (or add it if
missing) and keep clickhouseFactory and OTLPExporter usage unchanged.

deepshekhardas pushed a commit to deepshekhardas/trigger.dev that referenced this pull request Apr 7, 2026
@matt-aitken force-pushed the org-scoped-clickhouse branch from a60f2cb to 6179dc7 on April 15, 2026 at 14:02
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts (1)

13-27: ⚠️ Potential issue | 🟠 Major

Don't silently drop the connection-tuning params.

keepAliveEnabled, keepAliveIdleSocketTtl, and maxOpenConnections are still required by the request schema, but this factory no longer passes them into RunsReplicationService. That makes the admin API accept values it doesn't honor and can unexpectedly revert replication behavior to defaults.

Proposed fix
 function createRunReplicationService(params: CreateRunReplicationServiceParams) {
   const {
     name,
+    keepAliveEnabled,
+    keepAliveIdleSocketTtl,
+    maxOpenConnections,
     maxFlushConcurrency,
     flushIntervalMs,
     flushBatchSize,
     leaderLockTimeoutMs,
     leaderLockExtendIntervalMs,
@@
   const service = new RunsReplicationService({
     clickhouseFactory,
     pgConnectionUrl: env.DATABASE_URL,
     serviceName: name,
+    keepAliveEnabled,
+    keepAliveIdleSocketTtl,
+    maxOpenConnections,
     slotName: env.RUN_REPLICATION_SLOT_NAME,
     publicationName: env.RUN_REPLICATION_PUBLICATION_NAME,

Also applies to: 82-120

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts` around lines
13 - 27, The request schema (CreateRunReplicationServiceParams) requires
connection-tuning params keepAliveEnabled, keepAliveIdleSocketTtl, and
maxOpenConnections but the factory that constructs RunsReplicationService is not
forwarding them, so the API accepts but ignores these values; update the factory
that calls RunsReplicationService (the runs replication service
constructor/factory) to include these three fields when building the service
config/object (pass keepAliveEnabled, keepAliveIdleSocketTtl, maxOpenConnections
from CreateRunReplicationServiceParams into RunsReplicationService), and ensure
any defaulting/validation logic remains consistent with the schema so values are
honored rather than dropped.
♻️ Duplicate comments (5)
apps/webapp/app/env.server.ts (1)

1304-1309: ⚠️ Potential issue | 🟠 Major

Constrain registry reload interval to avoid tight-loop reloads.

ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS currently accepts 0 and negative values. Misconfiguration here can create a hot reload loop and unnecessary service load.

Suggested fix
     ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS: z.coerce
       .number()
       .int()
+      .min(60_000)
+      .max(86_400_000)
       .default(60 * 1000), // 1 minute
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/env.server.ts` around lines 1304 - 1309,
ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS currently allows 0 or negative
values which can cause tight reload loops; update its Zod schema to enforce a
sensible positive minimum (e.g., replace .int().default(60 * 1000) with
.int().min(1000).default(60 * 1000)) so values below 1000ms are rejected and
configuration cannot set 0 or negative intervals.
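The interval guard the prompt describes can be sketched without Zod as a plain parser; the variable name is from the finding, and the 1000ms floor and 1-minute default are the values the prompt suggests, not necessarily what ships.

```typescript
// Hypothetical sketch of the reload-interval guard: reject non-integers
// and anything below a 1s floor so a misconfigured value cannot create a
// tight reload loop. The real code enforces this via a Zod schema.
function parseReloadIntervalMs(raw: string | undefined): number {
  const fallback = 60_000; // 1 minute default
  const value = raw === undefined ? fallback : Number(raw);
  if (!Number.isInteger(value) || value < 1_000) {
    throw new Error(
      `ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS must be an integer >= 1000, got ${raw}`
    );
  }
  return value;
}
```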
apps/webapp/app/routes/admin.data-stores.tsx (3)

305-307: ⚠️ Potential issue | 🟠 Major

Move success-close logic out of render and gate it to a completed submission.

Lines 305-307 and 375-377 perform state updates during render. This is a React render-phase side effect, and because fetcher.data?.success is sticky, reopening can immediately close again unless you gate close-on-success to a submit→idle transition.

Suggested fix pattern
-import { useState } from "react";
+import { useEffect, useRef, useState } from "react";
...
 function EditButton({ name, organizationIds }: { name: string; organizationIds: string[] }) {
   const [open, setOpen] = useState(false);
   const fetcher = useFetcher<{ success?: boolean; error?: string }>();
+  const wasSubmittingRef = useRef(false);
   const isSubmitting = fetcher.state !== "idle";
-
-  if (fetcher.data?.success && open) {
-    setOpen(false);
-  }
+  useEffect(() => {
+    if (fetcher.state === "submitting") wasSubmittingRef.current = true;
+    if (wasSubmittingRef.current && fetcher.state === "idle") {
+      if (fetcher.data?.success && open) setOpen(false);
+      wasSubmittingRef.current = false;
+    }
+  }, [fetcher.state, fetcher.data?.success, open]);
...
 function AddDataStoreDialog(...) {
   const fetcher = useFetcher<{ success?: boolean; error?: string }>();
+  const wasSubmittingRef = useRef(false);
   const isSubmitting = fetcher.state !== "idle";
-
-  if (fetcher.data?.success && open) {
-    onOpenChange(false);
-  }
+  useEffect(() => {
+    if (fetcher.state === "submitting") wasSubmittingRef.current = true;
+    if (wasSubmittingRef.current && fetcher.state === "idle") {
+      if (fetcher.data?.success && open) onOpenChange(false);
+      wasSubmittingRef.current = false;
+    }
+  }, [fetcher.state, fetcher.data?.success, open, onOpenChange]);
#!/bin/bash
# Locate render-phase close-on-success blocks in this route
rg -nP 'if \(fetcher\.data\?\.success && open\)\s*\{' apps/webapp/app/routes/admin.data-stores.tsx -A2 -B1

Also applies to: 375-377

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.data-stores.tsx` around lines 305 - 307, The
render currently closes the modal by calling setOpen when fetcher.data?.success
&& open inside the component render; move this into a useEffect to avoid
render-phase side effects and to only close after a completed submission
transition (submit→idle). Add a useEffect that watches fetcher.state (and
fetcher.data?.success) and records previous fetcher.state (or checks for
fetcher.state === "idle" && previousState === "submitting") and then calls
setOpen(false) only when fetcher.data?.success is true and the state has just
transitioned to idle; update references to fetcher, open, and setOpen
accordingly (replace the inline if in the render and the duplicate at the other
location).

31-31: ⚠️ Potential issue | 🟡 Minor

Use @trigger.dev/core subpath import for tryCatch.

Line 31 imports from the package root, which breaks the webapp subpath-import rule.

Suggested fix
-import { tryCatch } from "@trigger.dev/core";
+import { tryCatch } from "@trigger.dev/core/utils";
#!/bin/bash
# Verify root imports from `@trigger.dev/core` in this route
rg -nP 'from "@trigger\.dev/core"' apps/webapp/app/routes/admin.data-stores.tsx

As per coding guidelines, "When importing from @trigger.dev/core in the webapp, use subpath exports from the package.json instead of importing from the root path."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.data-stores.tsx` at line 31, Replace the
root-package import of tryCatch with the package subpath export: change the
import line that reads import { tryCatch } from "@trigger.dev/core"; to use the
subpath import (e.g. import { tryCatch } from "@trigger.dev/core/tryCatch";) so
the webapp follows the subpath-import rule; update any nearby references if your
build flags require exact subpath syntax.

96-97: ⚠️ Potential issue | 🟠 Major

Avoid uncaught Zod throws in add/update URL validation paths.

Lines 96 and 121 use .parse(), which throws before the tryCatch(...) around the registry calls. Invalid URL input can therefore bypass your 400 path and surface as a 500.

Suggested fix
-      const config = ClickhouseConnectionSchema.parse({ url: connectionUrl });
+      const parsedConfig = ClickhouseConnectionSchema.safeParse({ url: connectionUrl });
+      if (!parsedConfig.success) {
+        return typedjson(
+          { error: parsedConfig.error.issues.map((i) => i.message).join(", ") },
+          { status: 400 }
+        );
+      }
+      const config = parsedConfig.data;
...
-      const config = connectionUrl
-        ? ClickhouseConnectionSchema.parse({ url: connectionUrl })
-        : undefined;
+      let config: { url: string } | undefined = undefined;
+      if (connectionUrl) {
+        const parsedConfig = ClickhouseConnectionSchema.safeParse({ url: connectionUrl });
+        if (!parsedConfig.success) {
+          return typedjson(
+            { error: parsedConfig.error.issues.map((i) => i.message).join(", ") },
+            { status: 400 }
+          );
+        }
+        config = parsedConfig.data;
+      }
#!/bin/bash
# Verify all direct parse usages in this route
rg -n 'ClickhouseConnectionSchema\.parse\(' apps/webapp/app/routes/admin.data-stores.tsx

Based on learnings, when using Zod safeParse, check .success before accessing .data.

Also applies to: 120-122

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.data-stores.tsx` around lines 96 - 97, Replace
direct Zod throws by using ClickhouseConnectionSchema.safeParse instead of
ClickhouseConnectionSchema.parse in the add/update URL validation paths (the
places currently calling ClickhouseConnectionSchema.parse). Check the returned
result.success and if false return the existing 400 error path (or construct the
same validation error response) instead of letting Zod throw; then use
result.data when success. Ensure validation happens before the tryCatch/registry
calls so invalid input follows the 400 branch rather than bubbling as a 500.
apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts (1)

362-383: ⚠️ Potential issue | 🟠 Major

Apply startTimeMaxAgeMs to the v2 repository too.

The v1 branch passes startTimeMaxAgeMs, but the v2 branch omits it. ClickhouseEventRepository still uses that guard to clamp stale start_time values before inserts, so v2 currently loses the same protection.

🔧 Proposed fix
     case "clickhouse_v2": {
       return new ClickhouseEventRepository({
         clickhouse: clickhouse,
         batchSize: env.EVENTS_CLICKHOUSE_BATCH_SIZE,
         flushInterval: env.EVENTS_CLICKHOUSE_FLUSH_INTERVAL_MS,
         maximumTraceSummaryViewCount: env.EVENTS_CLICKHOUSE_MAX_TRACE_SUMMARY_VIEW_COUNT,
         maximumTraceDetailedSummaryViewCount:
           env.EVENTS_CLICKHOUSE_MAX_TRACE_DETAILED_SUMMARY_VIEW_COUNT,
         maximumLiveReloadingSetting: env.EVENTS_CLICKHOUSE_MAX_LIVE_RELOADING_SETTING,
         insertStrategy: env.EVENTS_CLICKHOUSE_INSERT_STRATEGY,
         waitForAsyncInsert: env.EVENTS_CLICKHOUSE_WAIT_FOR_ASYNC_INSERT === "1",
         asyncInsertMaxDataSize: env.EVENTS_CLICKHOUSE_ASYNC_INSERT_MAX_DATA_SIZE,
         asyncInsertBusyTimeoutMs: env.EVENTS_CLICKHOUSE_ASYNC_INSERT_BUSY_TIMEOUT_MS,
+        startTimeMaxAgeMs: env.EVENTS_CLICKHOUSE_START_TIME_MAX_AGE_MS,
         llmMetricsBatchSize: env.LLM_METRICS_BATCH_SIZE,
         llmMetricsFlushInterval: env.LLM_METRICS_FLUSH_INTERVAL_MS,
         llmMetricsMaxBatchSize: env.LLM_METRICS_MAX_BATCH_SIZE,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts` around lines
362 - 383, The v2 branch constructing ClickhouseEventRepository omits the
startTimeMaxAgeMs option so the repository's clamping of stale start_time is
skipped; add startTimeMaxAgeMs: env.EVENTS_CLICKHOUSE_START_TIME_MAX_AGE_MS (or
the existing env key used by v1) to the options passed in the "clickhouse_v2"
case when creating new ClickhouseEventRepository so v2 receives the same guard
as v1.
🧹 Nitpick comments (1)
apps/webapp/app/v3/eventRepository/index.server.ts (1)

59-72: Minor inefficiency: factory called for all stores including Postgres.

The factory is called unconditionally on lines 59-62, but for Postgres stores the result is discarded and eventRepository is returned on line 72. Consider moving the factory call inside the ClickHouse conditionals to avoid unnecessary work for Postgres-backed organizations.

Suggested optimization
-  const { repository: resolvedRepository } = await clickhouseFactory.getEventRepositoryForOrganization(
-    taskEventStore,
-    organizationId
-  );
-
   if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE_V2) {
+    const { repository: resolvedRepository } = await clickhouseFactory.getEventRepositoryForOrganization(
+      taskEventStore,
+      organizationId
+    );
     return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 };
   }

   if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE) {
+    const { repository: resolvedRepository } = await clickhouseFactory.getEventRepositoryForOrganization(
+      taskEventStore,
+      organizationId
+    );
     return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE };
   }

   return { repository: eventRepository, store: EVENT_STORE_TYPES.POSTGRES };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/eventRepository/index.server.ts` around lines 59 - 72, The
code currently calls clickhouseFactory.getEventRepositoryForOrganization
unconditionally and discards its result for Postgres; move the call inside the
ClickHouse branches so it's only invoked when taskEventStore ===
EVENT_STORE_TYPES.CLICKHOUSE_V2 or === EVENT_STORE_TYPES.CLICKHOUSE. Concretely,
keep the final return { repository: eventRepository, store:
EVENT_STORE_TYPES.POSTGRES } for non-ClickHouse, and in each ClickHouse
conditional call getEventRepositoryForOrganization to obtain { repository:
resolvedRepository } and return { repository: resolvedRepository, store:
EVENT_STORE_TYPES.CLICKHOUSE[_V2] } accordingly; ensure you still use the same
symbols: clickhouseFactory.getEventRepositoryForOrganization, taskEventStore,
EVENT_STORE_TYPES, eventRepository, and resolvedRepository.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/webapp/app/routes/resources.runs`.$runParam.logs.download.ts:
- Around line 36-39: The code currently calls
resolveEventRepositoryForStore(run.taskEventStore, run.organizationId ?? "")
which masks missing TaskRun.organizationId and can route to a default repo;
change this to explicitly check run.organizationId (e.g., if
(!run.organizationId) throw new Error or return an appropriate HTTP error)
before calling resolveEventRepositoryForStore so tenant-scoped resolution fails
fast when organizationId is null/undefined; update the call sites that use
resolveEventRepositoryForStore and references to
run.taskEventStore/run.organizationId to follow the same explicit-null-check
pattern.

In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts`:
- Around line 164-178: The org-scoped builder buildOrgClickhouseClient currently
applies only the generic CLICKHOUSE_* env settings; update it to branch on the
clientType ("events", "replication", "logs", etc.) and apply the per-client
tuning (use EVENTS_CLICKHOUSE_* for events clients for
pool/compression/maxOpenConnections, use RUN_REPLICATION_* for replication
clients, and include the same clickhouseSettings/safety caps used by
initializeLogsClickhouseClient for logs). Locate buildOrgClickhouseClient and
add conditional logic to select the appropriate env vars and clickhouse options
(keepAlive, compression, maxOpenConnections, and clickhouseSettings) based on
clientType so org-scoped clients match the non-org default behavior.
- Around line 157-159: The cache key uses hashHostname which only hashes
parsed.hostname causing collisions across different org stores on the same host;
update hashHostname (or replace with a new helper used by _clientCache and
_eventRepositoryCache) to hash the full datastore identity (e.g., the complete
URL including origin, pathname and search/query, or a canonical JSON of
connection options such as host, database, credentials and path) instead of only
parsed.hostname so each tenant+database+params yields a unique key; ensure all
places that call hashHostname (including usages tied to _clientCache and
_eventRepositoryCache) are updated to use the new full-identity hash.
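The full-identity hash that this prompt asks for can be sketched with Node's crypto module; the function name is hypothetical and the real helper may also fold in credentials or other connection options beyond what the URL carries.

```typescript
import { createHash } from "node:crypto";

// Hypothetical replacement for hashHostname: key the client caches on the
// full datastore identity (origin + path + query) rather than hostname
// alone, so two org stores on the same host but different databases or
// query params get distinct cache entries.
function hashConnectionIdentity(connectionUrl: string): string {
  const parsed = new URL(connectionUrl);
  const identity = `${parsed.origin}${parsed.pathname}${parsed.search}`;
  return createHash("sha256").update(identity).digest("hex");
}
```

Any call site currently passing parsed.hostname would pass the raw connection URL instead, so _clientCache and _eventRepositoryCache both pick up the stronger key.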

In `@apps/webapp/app/v3/runEngineHandlers.server.ts`:
- Around line 190-193: The code calls
resolveEventRepositoryForStore(run.taskEventStore, taskRun.organizationId ?? "")
which uses an empty string org ID when organizationId is missing; change this to
validate taskRun.organizationId (and any similar call sites such as the other
resolveEventRepositoryForStore usages) and bail out or throw if it's falsy
instead of passing "" — e.g., check taskRun.organizationId before calling
resolveEventRepositoryForStore, return an error/early exit (or propagate) when
missing, and only call resolveEventRepositoryForStore with the validated
non-empty organizationId to avoid writing events to the wrong backend.
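The fail-fast guard this prompt describes can be sketched as a tiny helper shared by the call sites; the function name and error message are illustrative, not from the codebase.

```typescript
// Hypothetical guard: refuse to resolve a tenant-scoped event repository
// with a missing org ID instead of silently falling back to "", which
// could route events to the wrong backend.
function requireOrganizationId(
  organizationId: string | null | undefined
): string {
  if (!organizationId) {
    throw new Error(
      "Missing organizationId for taskRun; refusing to resolve event repository"
    );
  }
  return organizationId;
}
```

Each `resolveEventRepositoryForStore(store, orgId ?? "")` call site would become `resolveEventRepositoryForStore(store, requireOrganizationId(orgId))`.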

In `@apps/webapp/app/v3/services/completeAttempt.server.ts`:
- Around line 166-169: The current calls to resolveEventRepositoryForStore (and
the two other repository resolution calls at the success, failure, and retry
paths) use a fallback organizationId of "" via
taskRunAttempt.taskRun.organizationId ?? "", which silently masks missing tenant
IDs; change each to explicitly guard for a missing organizationId on
taskRunAttempt.taskRun and throw or return an explicit error instead of using ""
— locate usages of resolveEventRepositoryForStore and the analogous repository
resolution functions near the success/failure/retry paths (also repeated around
the other occurrences) and replace the fallback with a check like if
(!taskRunAttempt.taskRun.organizationId) { throw new Error("Missing
organizationId for taskRunAttempt"); } before calling the resolver so the
resolver always receives a real organizationId.

In `@apps/webapp/test/clickhouseFactory.test.ts`:
- Around line 1-3: The test imports only describe and expect from vitest but
uses vi (e.g., vi.mock and vi.setConfig); update the import statement in
clickhouseFactory.test.ts to also import vi from "vitest" so calls like
vi.mock(...) and vi.setConfig(...) reference the vitest mock API correctly.

---

Outside diff comments:
In `@apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts`:
- Around line 13-27: The request schema (CreateRunReplicationServiceParams)
requires connection-tuning params keepAliveEnabled, keepAliveIdleSocketTtl, and
maxOpenConnections but the factory that constructs RunsReplicationService is not
forwarding them, so the API accepts but ignores these values; update the factory
that calls RunsReplicationService (the runs replication service
constructor/factory) to include these three fields when building the service
config/object (pass keepAliveEnabled, keepAliveIdleSocketTtl, maxOpenConnections
from CreateRunReplicationServiceParams into RunsReplicationService), and ensure
any defaulting/validation logic remains consistent with the schema so values are
honored rather than dropped.

---

Duplicate comments:
In `@apps/webapp/app/env.server.ts`:
- Around line 1304-1309: ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS currently
allows 0 or negative values which can cause tight reload loops; update its Zod
schema to enforce a sensible positive minimum (e.g., replace .int().default(60 *
1000) with .int().min(1000).default(60 * 1000)) so values below 1000ms are
rejected and configuration cannot set 0 or negative intervals.

In `@apps/webapp/app/routes/admin.data-stores.tsx`:
- Around line 305-307: The render currently closes the modal by calling setOpen
when fetcher.data?.success && open inside the component render; move this into a
useEffect to avoid render-phase side effects and to only close after a completed
submission transition (submit→idle). Add a useEffect that watches fetcher.state
(and fetcher.data?.success) and records previous fetcher.state (or checks for
fetcher.state === "idle" && previousState === "submitting") and then calls
setOpen(false) only when fetcher.data?.success is true and the state has just
transitioned to idle; update references to fetcher, open, and setOpen
accordingly (replace the inline if in the render and the duplicate at the other
location).
- Line 31: Replace the root-package import of tryCatch with the package subpath
export: change the import line that reads import { tryCatch } from
"@trigger.dev/core"; to use the subpath import (e.g. import { tryCatch } from
"@trigger.dev/core/tryCatch";) so the webapp follows the subpath-import rule;
update any nearby references if your build flags require exact subpath syntax.
- Around line 96-97: Replace direct Zod throws by using
ClickhouseConnectionSchema.safeParse instead of ClickhouseConnectionSchema.parse
in the add/update URL validation paths (the places currently calling
ClickhouseConnectionSchema.parse). Check the returned result.success and if
false return the existing 400 error path (or construct the same validation error
response) instead of letting Zod throw; then use result.data when success.
Ensure validation happens before the tryCatch/registry calls so invalid input
follows the 400 branch rather than bubbling as a 500.

In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts`:
- Around line 362-383: The v2 branch constructing ClickhouseEventRepository
omits the startTimeMaxAgeMs option so the repository's clamping of stale
start_time is skipped; add startTimeMaxAgeMs:
env.EVENTS_CLICKHOUSE_START_TIME_MAX_AGE_MS (or the existing env key used by v1)
to the options passed in the "clickhouse_v2" case when creating new
ClickhouseEventRepository so v2 receives the same guard as v1.

---

Nitpick comments:
In `@apps/webapp/app/v3/eventRepository/index.server.ts`:
- Around line 59-72: The code currently calls
clickhouseFactory.getEventRepositoryForOrganization unconditionally and discards
its result for Postgres; move the call inside the ClickHouse branches so it's
only invoked when taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE_V2 or ===
EVENT_STORE_TYPES.CLICKHOUSE. Concretely, keep the final return { repository:
eventRepository, store: EVENT_STORE_TYPES.POSTGRES } for non-ClickHouse, and in
each ClickHouse conditional call getEventRepositoryForOrganization to obtain {
repository: resolvedRepository } and return { repository: resolvedRepository,
store: EVENT_STORE_TYPES.CLICKHOUSE[_V2] } accordingly; ensure you still use the
same symbols: clickhouseFactory.getEventRepositoryForOrganization,
taskEventStore, EVENT_STORE_TYPES, eventRepository, and resolvedRepository.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 8be3f360-595a-4400-b4e5-adcbb4c08b8f

📥 Commits

Reviewing files that changed from the base of the PR and between a60f2cb and 6179dc7.

📒 Files selected for processing (79)
  • .cursor/mcp.json
  • .server-changes/organization-scoped-clickhouse.md
  • CLAUDE.md
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/ApiRunListPresenter.server.ts
  • apps/webapp/app/presenters/v3/CreateBulkActionPresenter.server.ts
  • apps/webapp/app/presenters/v3/RunPresenter.server.ts
  • apps/webapp/app/presenters/v3/RunTagListPresenter.server.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/TaskListPresenter.server.ts
  • apps/webapp/app/presenters/v3/UsagePresenter.server.ts
  • apps/webapp/app/presenters/v3/ViewSchedulePresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.dashboards.$dashboardKey/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.$modelId/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.compare/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts.$promptSlug/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.test.tasks.$taskParam/route.tsx
  • apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts
  • apps/webapp/app/routes/admin.api.v1.runs-replication.start.ts
  • apps/webapp/app/routes/admin.data-stores.tsx
  • apps/webapp/app/routes/admin.tsx
  • apps/webapp/app/routes/api.v1.prompts.$slug.ts
  • apps/webapp/app/routes/api.v1.prompts.$slug.versions.ts
  • apps/webapp/app/routes/api.v1.prompts._index.ts
  • apps/webapp/app/routes/api.v1.runs.$runId.events.ts
  • apps/webapp/app/routes/api.v1.runs.$runId.spans.$spanId.ts
  • apps/webapp/app/routes/api.v1.runs.$runId.trace.ts
  • apps/webapp/app/routes/otel.v1.logs.ts
  • apps/webapp/app/routes/otel.v1.metrics.ts
  • apps/webapp/app/routes/otel.v1.traces.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.$logId.tsx
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts.$promptSlug.generations.ts
  • apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts
  • apps/webapp/app/runEngine/concerns/traceEvents.server.ts
  • apps/webapp/app/runEngine/services/triggerFailedTask.server.ts
  • apps/webapp/app/services/admin/missingLlmModels.server.ts
  • apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts
  • apps/webapp/app/services/clickhouse/clickhouseSecretSchemas.server.ts
  • apps/webapp/app/services/clickhouseInstance.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoreConfigSchemas.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistryInstance.server.ts
  • apps/webapp/app/services/queryService.server.ts
  • apps/webapp/app/services/runsReplicationInstance.server.ts
  • apps/webapp/app/services/runsReplicationService.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts
  • apps/webapp/app/v3/eventRepository/eventRepository.server.ts
  • apps/webapp/app/v3/eventRepository/eventRepository.types.ts
  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/v3/otlpExporter.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts
  • apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
  • apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
  • apps/webapp/app/v3/services/completeAttempt.server.ts
  • apps/webapp/app/v3/services/crashTaskRun.server.ts
  • apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
  • apps/webapp/app/v3/services/triggerTaskV1.server.ts
  • apps/webapp/test/clickhouseFactory.test.ts
  • apps/webapp/test/organizationDataStoresRegistry.test.ts
  • apps/webapp/test/runsBackfiller.test.ts
  • apps/webapp/test/runsReplicationBenchmark.test.ts
  • apps/webapp/test/runsReplicationService.part1.test.ts
  • apps/webapp/test/runsReplicationService.part2.test.ts
  • apps/webapp/test/utils/replicationUtils.ts
  • apps/webapp/test/utils/testReplicationClickhouseFactory.ts
  • apps/webapp/test/utils/tracing.ts
  • internal-packages/database/prisma/migrations/20260331212308_add_organization_data_stores/migration.sql
  • internal-packages/database/prisma/schema.prisma
💤 Files with no reviewable changes (2)
  • apps/webapp/app/services/clickhouseInstance.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts
✅ Files skipped from review due to trivial changes (18)
  • .server-changes/organization-scoped-clickhouse.md
  • apps/webapp/app/routes/admin.tsx
  • .cursor/mcp.json
  • CLAUDE.md
  • apps/webapp/app/runEngine/services/triggerFailedTask.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts.$promptSlug/route.tsx
  • apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/test/runsReplicationService.part1.test.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.test.tasks.$taskParam/route.tsx
  • apps/webapp/test/runsReplicationService.part2.test.ts
  • apps/webapp/app/services/clickhouse/clickhouseSecretSchemas.server.ts
  • apps/webapp/test/organizationDataStoresRegistry.test.ts
  • apps/webapp/app/services/dataStores/organizationDataStoreConfigSchemas.server.ts
  • internal-packages/database/prisma/migrations/20260331212308_add_organization_data_stores/migration.sql
  • internal-packages/database/prisma/schema.prisma
  • apps/webapp/app/v3/otlpExporter.server.ts
  • apps/webapp/app/services/runsReplicationService.server.ts
🚧 Files skipped from review as they are similar to previous changes (27)
  • apps/webapp/app/presenters/v3/CreateBulkActionPresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.compare/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx
  • apps/webapp/test/runsBackfiller.test.ts
  • apps/webapp/app/presenters/v3/ApiRunListPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.$logId.tsx
  • apps/webapp/app/routes/otel.v1.metrics.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.$modelId/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.dashboards.$dashboardKey/route.tsx
  • apps/webapp/test/utils/tracing.ts
  • apps/webapp/app/services/queryService.server.ts
  • apps/webapp/app/routes/otel.v1.logs.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.ts
  • apps/webapp/app/presenters/v3/RunPresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts._index/route.tsx
  • apps/webapp/app/routes/otel.v1.traces.ts
  • apps/webapp/app/routes/api.v1.prompts._index.ts
  • apps/webapp/test/runsReplicationBenchmark.test.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx
  • apps/webapp/app/services/admin/missingLlmModels.server.ts
  • apps/webapp/app/services/runsReplicationInstance.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistryInstance.server.ts
  • apps/webapp/app/presenters/v3/TaskListPresenter.server.ts
  • apps/webapp/test/utils/testReplicationClickhouseFactory.ts
  • apps/webapp/app/routes/api.v1.prompts.$slug.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts

Comment on lines +36 to +39
const eventRepository = resolveEventRepositoryForStore(
run.taskEventStore,
run.organizationId ?? ""
);
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== TaskRun model organizationId definition =="
fd -i 'schema.prisma' --exec sh -c '
  echo "--- $1 ---"
  sed -n "/model TaskRun {/,/}/p" "$1" | rg -n "organizationId|taskEventStore|runtimeEnvironmentId"
' sh {}

echo
echo "== resolveEventRepositoryForStore callsites using empty-string fallback =="
rg -nP --type=ts 'resolveEventRepositoryForStore\(\s*[^,]+,\s*[^)]*\?\?\s*""\s*\)'

echo
echo "== resolveEventRepositoryForStore signature =="
rg -nP --type=ts 'export function resolveEventRepositoryForStore\('

Repository: triggerdotdev/trigger.dev

Length of output: 510


🏁 Script executed:

cat -n apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts | head -60

Repository: triggerdotdev/trigger.dev

Length of output: 156


🏁 Script executed:

rg -nA 5 'export function resolveEventRepositoryForStore' apps/

Repository: triggerdotdev/trigger.dev

Length of output: 699


🏁 Script executed:

rg -nB 5 -A 10 'model TaskRun' internal-packages/database/prisma/schema.prisma

Repository: triggerdotdev/trigger.dev

Length of output: 4886


🏁 Script executed:

rg -n 'organizationId' apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts

Repository: triggerdotdev/trigger.dev

Length of output: 164


🏁 Script executed:

find apps/webapp/app/routes -name "*logs.download*" -type f

Repository: triggerdotdev/trigger.dev

Length of output: 133


🏁 Script executed:

rg -n 'organizationId' internal-packages/database/prisma/schema.prisma | grep -A 2 -B 2 'TaskRun'

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

sed -n '669,900p' internal-packages/database/prisma/schema.prisma | grep -n 'organizationId'

Repository: triggerdotdev/trigger.dev

Length of output: 96


🏁 Script executed:

cat -n 'apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts'

Repository: triggerdotdev/trigger.dev

Length of output: 4676


🏁 Script executed:

rg -B 15 -A 5 'resolveEventRepositoryForStore' 'apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts'

Repository: triggerdotdev/trigger.dev

Length of output: 1341


🏁 Script executed:

rg -B 5 -A 20 'taskRun.findFirst' apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts

Repository: triggerdotdev/trigger.dev

Length of output: 164


🏁 Script executed:

# Check if organizationId is a required field in practice (is it set when TaskRun is created?)
rg -n 'organizationId:' internal-packages/database/prisma/schema.prisma | grep -A 5 -B 5 'TaskRun'

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Check how clickhouseFactory.getEventRepositoryForOrganizationSync handles empty string
rg -B 5 -A 15 'getEventRepositoryForOrganizationSync' apps/

Repository: triggerdotdev/trigger.dev

Length of output: 7990


🏁 Script executed:

# Check TaskRun.organizationId nullability more thoroughly
sed -n '669,800p' internal-packages/database/prisma/schema.prisma | cat -n

Repository: triggerdotdev/trigger.dev

Length of output: 4903


🏁 Script executed:

# Look for how TaskRun is created - does organizationId get set?
rg -n 'create.*TaskRun|prisma.taskRun.create' apps/webapp/app/services/ apps/webapp/app/v3/ | head -20

Repository: triggerdotdev/trigger.dev

Length of output: 1093


🏁 Script executed:

# Check for any validation or cleanup of empty organizationId
rg -n 'organizationId.*""' apps/webapp/

Repository: triggerdotdev/trigger.dev

Length of output: 1704


🏁 Script executed:

# Check if TaskRun.organizationId is always populated during creation
rg -B 10 -A 10 'data.*organizationId' apps/webapp/app/v3/ | grep -A 10 -B 10 'taskRun\|TaskRun' | head -40

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Check if there are any comments or TODOs about the organizationId fallback
rg -B 3 'organizationId ?? ""' apps/webapp/app/v3/eventRepository/

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Verify: does the query that fetches the run guarantee organizationId is set?
rg -B 20 'resolveEventRepositoryForStore' apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts | head -40

Repository: triggerdotdev/trigger.dev

Length of output: 164


🏁 Script executed:

# Try a different approach - find the file directly
find apps/webapp -name "logs.download.ts" -type f -exec head -40 {} \;

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Check if Project-to-Organization relationship guarantees organizationId on TaskRun
rg -B 5 -A 5 'organizationId' internal-packages/database/prisma/schema.prisma | grep -A 10 -B 10 'Project\|Organization' | head -50

Repository: triggerdotdev/trigger.dev

Length of output: 1334


🏁 Script executed:

# Check the query pattern - does fetching a run through project->organization relationship mean organizationId should exist?
sed -n '17,30p' apps/webapp/app/routes/resources.runs.\$runParam.logs.download.ts

Repository: triggerdotdev/trigger.dev

Length of output: 343


🏁 Script executed:

# Check how TaskRun gets created - specifically if organizationId is set
rg -B 5 -A 15 'prisma\.taskRun\.create' apps/webapp/app/v3/services/ | head -60

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Check if TaskRun uses projectId to derive organizationId, or if it's stored separately
rg -n 'organizationId.*projectId\|projectId.*organizationId' apps/webapp/app/v3/ | head -10

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Look at the Project model to understand the relationship
sed -n '/^model Project {/,/^}/p' internal-packages/database/prisma/schema.prisma | head -40

Repository: triggerdotdev/trigger.dev

Length of output: 1563


🏁 Script executed:

# Check if organizationId is always set when TaskRun is created (by looking at actual creation code)
rg -B 20 'taskRun.*create' apps/webapp/app/v3/services/triggerV3.server.ts | head -80

Repository: triggerdotdev/trigger.dev

Length of output: 156


🏁 Script executed:

# Look for any validation logic that checks organizationId after fetching
rg -B 5 -A 10 'if.*organizationId\|organizationId.*if' apps/webapp/app/v3/eventRepository/

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Check if the query always returns organizationId from the database (by looking at select clauses)
rg -B 5 'taskRun.findFirst.*where' apps/webapp/app/v3/eventRepository/ apps/webapp/app/services/ | head -30

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Find where TaskRun is created - search for actual create statements
find apps/webapp -name "*.server.ts" -o -name "*.server.tsx" | xargs rg -l 'taskRun.*create|prisma\.taskRun\.create' | head -5

Repository: triggerdotdev/trigger.dev

Length of output: 340


🏁 Script executed:

# Look at one of the create implementations
fd -t f '*.server.ts' apps/webapp/app/v3/services | xargs grep -l 'create' | head -1 | xargs cat | head -100

Repository: triggerdotdev/trigger.dev

Length of output: 359


🏁 Script executed:

# Check if organizationId is populated when TaskRun is created - search for data assignment
rg -B 5 -A 15 'taskRun.*create|create.*taskRun' apps/webapp/app/v3 | grep -A 15 'data.*{' | head -50

Repository: triggerdotdev/trigger.dev

Length of output: 4326


🏁 Script executed:

# Check if organizationId is guaranteed to be set from the query - look at the fetch context
cat -n apps/webapp/app/routes/resources.runs.\$runParam.logs.download.ts | sed -n '17,35p'

Repository: triggerdotdev/trigger.dev

Length of output: 551


🏁 Script executed:

# Now check if environment.organizationId is always present
rg -B 10 'organizationId: environment.organizationId' apps/webapp/app/v3/services/triggerTaskV1.server.ts | head -20

Repository: triggerdotdev/trigger.dev

Length of output: 536


🏁 Script executed:

# Look for any code that might create TaskRun without organizationId
rg 'taskRun.*create|prisma\.taskRun\.create' -A 20 apps/webapp/app/v3/services/ | grep -v 'organizationId' | head -40

Repository: triggerdotdev/trigger.dev

Length of output: 3896


🏁 Script executed:

# Check if the query implicitly guarantees organizationId through the project->organization relationship
# Does Project always have an organizationId?
sed -n '/^model Project {/,/^}/p' internal-packages/database/prisma/schema.prisma | grep -n organizationId

Repository: triggerdotdev/trigger.dev

Length of output: 218


🏁 Script executed:

# Check if clickhouseFactory handles empty string specially or if it causes issues
cat -n apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts | sed -n '60,120p'

Repository: triggerdotdev/trigger.dev

Length of output: 2335


🏁 Script executed:

# See what getEventRepositoryForOrganizationSync does with an empty/invalid organizationId
rg -B 5 -A 20 'getEventRepositoryForOrganizationSync' apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts | head -50

Repository: triggerdotdev/trigger.dev

Length of output: 1117


🏁 Script executed:

# Critical check: Can organizationId actually be NULL at runtime, or is it always set?
# Look for any code paths that create TaskRun without setting organizationId
rg -B 10 'taskRun\.create' apps/webapp/app/v3/ | grep -B 10 'organizationId' | grep -v 'organizationId' | tail -20

Repository: triggerdotdev/trigger.dev

Length of output: 51


🏁 Script executed:

# Check if there are migrations or older code that might create TaskRun without organizationId
git log --oneline --all -- internal-packages/database/prisma/schema.prisma | head -5

Repository: triggerdotdev/trigger.dev

Length of output: 392


🏁 Script executed:

# Final verification: Does the default Prisma select include organizationId?
rg -B 3 'taskRun.findFirst' apps/webapp/app/routes/resources.runs.\$runParam.logs.download.ts

Repository: triggerdotdev/trigger.dev

Length of output: 240


Use explicit null checks for org-scoped repository resolution instead of silent empty-string fallback.

TaskRun.organizationId is optional in the schema (organizationId String?). Using run.organizationId ?? "" masks null values and silently routes to a default repository, risking cross-tenant data leakage. When the org ID is missing, the wrong event store may be accessed without raising an error.

Fail fast on missing org ID for tenant-scoped storage. Replace the fallback with an explicit check:

-  const eventRepository = resolveEventRepositoryForStore(
-    run.taskEventStore,
-    run.organizationId ?? ""
-  );
+  if (!run.organizationId) {
+    return new Response("Run organization is missing", { status: 500 });
+  }
+
+  const eventRepository = resolveEventRepositoryForStore(run.taskEventStore, run.organizationId);

This pattern is also used in apps/webapp/app/v3/runEngineHandlers.server.ts, apps/webapp/app/v3/services/completeAttempt.server.ts, and apps/webapp/app/v3/eventRepository/eventRepository.server.ts.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-const eventRepository = resolveEventRepositoryForStore(
-  run.taskEventStore,
-  run.organizationId ?? ""
-);
+if (!run.organizationId) {
+  return new Response("Run organization is missing", { status: 500 });
+}
+
+const eventRepository = resolveEventRepositoryForStore(
+  run.taskEventStore,
+  run.organizationId
+);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts` around
lines 36 - 39, The code currently calls
resolveEventRepositoryForStore(run.taskEventStore, run.organizationId ?? "")
which masks missing TaskRun.organizationId and can route to a default repo;
change this to explicitly check run.organizationId (e.g., if
(!run.organizationId) throw new Error or return an appropriate HTTP error)
before calling resolveEventRepositoryForStore so tenant-scoped resolution fails
fast when organizationId is null/undefined; update the call sites that use
resolveEventRepositoryForStore and references to
run.taskEventStore/run.organizationId to follow the same explicit-null-check
pattern.

Comment on lines +157 to +159
function hashHostname(url: string): string {
const parsed = new URL(url);
return createHash("sha256").update(parsed.hostname).digest("hex");

⚠️ Potential issue | 🔴 Critical

Use the full data-store identity in the cache key.

Hashing only parsed.hostname lets different org stores on the same host collide in _clientCache and _eventRepositoryCache. If two orgs use the same ClickHouse host with different databases/credentials/path/query params, the second org can reuse the first org’s client/repository and hit the wrong tenant store.

🔧 Proposed fix
-function hashHostname(url: string): string {
+function hashDataStore(url: string): string {
   const parsed = new URL(url);
-  return createHash("sha256").update(parsed.hostname).digest("hex");
+  parsed.searchParams.delete("secure");
+  return createHash("sha256").update(parsed.toString()).digest("hex");
 }
@@
-    const hostnameHash = hashHostname(dataStore.url);
-    const cacheKey = `${hostnameHash}:${clientType}`;
+    const dataStoreHash = hashDataStore(dataStore.url);
+    const cacheKey = `${dataStoreHash}:${clientType}`;
@@
-    const hostnameHash = hashHostname(dataStore.url);
-    const cacheKey = `${hostnameHash}:events:${store}`;
+    const dataStoreHash = hashDataStore(dataStore.url);
+    const cacheKey = `${dataStoreHash}:events:${store}`;

Also applies to: 230-236, 270-277

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts` around lines
157 - 159, The cache key uses hashHostname which only hashes parsed.hostname
causing collisions across different org stores on the same host; update
hashHostname (or replace with a new helper used by _clientCache and
_eventRepositoryCache) to hash the full datastore identity (e.g., the complete
URL including origin, pathname and search/query, or a canonical JSON of
connection options such as host, database, credentials and path) instead of only
parsed.hostname so each tenant+database+params yields a unique key; ensure all
places that call hashHostname (including usages tied to _clientCache and
_eventRepositoryCache) are updated to use the new full-identity hash.

Comment on lines +164 to +178
function buildOrgClickhouseClient(url: string, clientType: ClientType): ClickHouse {
const parsed = new URL(url);
parsed.searchParams.delete("secure");

return new ClickHouse({
url: parsed.toString(),
name: `org-clickhouse-${clientType}`,
keepAlive: {
enabled: env.CLICKHOUSE_KEEP_ALIVE_ENABLED === "1",
idleSocketTtl: env.CLICKHOUSE_KEEP_ALIVE_IDLE_SOCKET_TTL_MS,
},
logLevel: env.CLICKHOUSE_LOG_LEVEL,
compression: { request: true },
maxOpenConnections: env.CLICKHOUSE_MAX_OPEN_CONNECTIONS,
});

⚠️ Potential issue | 🟠 Major

Preserve the per-client tuning for org-scoped clients.

buildOrgClickhouseClient() always uses the generic CLICKHOUSE_* settings, so org-scoped "events" clients lose the EVENTS_CLICKHOUSE_* pool/compression settings, "replication" loses the RUN_REPLICATION_* settings, and "logs" drops the extra clickhouseSettings safety caps from initializeLogsClickhouseClient(). That makes the org-scoped path behave differently from the default path on the exact hot paths this PR calls out.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts` around lines
164 - 178, The org-scoped builder buildOrgClickhouseClient currently applies
only the generic CLICKHOUSE_* env settings; update it to branch on the
clientType ("events", "replication", "logs", etc.) and apply the per-client
tuning (use EVENTS_CLICKHOUSE_* for events clients for
pool/compression/maxOpenConnections, use RUN_REPLICATION_* for replication
clients, and include the same clickhouseSettings/safety caps used by
initializeLogsClickhouseClient for logs). Locate buildOrgClickhouseClient and
add conditional logic to select the appropriate env vars and clickhouse options
(keepAlive, compression, maxOpenConnections, and clickhouseSettings) based on
clientType so org-scoped clients match the non-org default behavior.

Comment on lines +190 to +193
const eventRepository = resolveEventRepositoryForStore(
run.taskEventStore,
taskRun.organizationId ?? ""
);

⚠️ Potential issue | 🟠 Major

Avoid resolving org-scoped repositories with "" as the org ID.

If organizationId is ever missing here, this will still perform a lookup against an empty org key and can write attempt/cached events to the wrong backend instead of failing fast. Please guard and bail out, or plumb a real org ID into these paths.

Proposed fix
-    const eventRepository = resolveEventRepositoryForStore(
-      run.taskEventStore,
-      taskRun.organizationId ?? ""
-    );
+    if (!taskRun.organizationId) {
+      logger.error("[runAttemptFailed] Missing organizationId for task run", {
+        runId: run.id,
+      });
+      return;
+    }
+
+    const eventRepository = resolveEventRepositoryForStore(
+      run.taskEventStore,
+      taskRun.organizationId
+    );
-      const eventRepository = resolveEventRepositoryForStore(
-        blockedRun.taskEventStore,
-        blockedRun.organizationId ?? ""
-      );
+      if (!blockedRun.organizationId) {
+        logger.error("[cachedRunCompleted] Missing organizationId for blocked run", {
+          blockedRunId,
+        });
+        return;
+      }
+
+      const eventRepository = resolveEventRepositoryForStore(
+        blockedRun.taskEventStore,
+        blockedRun.organizationId
+      );

Also applies to: 294-297

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/runEngineHandlers.server.ts` around lines 190 - 193, The
code calls resolveEventRepositoryForStore(run.taskEventStore,
taskRun.organizationId ?? "") which uses an empty string org ID when
organizationId is missing; change this to validate taskRun.organizationId (and
any similar call sites such as the other resolveEventRepositoryForStore usages)
and bail out or throw if it's falsy instead of passing "" — e.g., check
taskRun.organizationId before calling resolveEventRepositoryForStore, return an
error/early exit (or propagate) when missing, and only call
resolveEventRepositoryForStore with the validated non-empty organizationId to
avoid writing events to the wrong backend.

Comment on lines +166 to +169
const eventRepository = resolveEventRepositoryForStore(
taskRunAttempt.taskRun.taskEventStore,
taskRunAttempt.taskRun.organizationId ?? ""
);

⚠️ Potential issue | 🟠 Major

Remove ?? "" fallback for organization ID in all three repository resolutions.

These hot-path calls should not default to an empty tenant key. If organizationId is unexpectedly missing, handle it explicitly (guard + error) instead of silently resolving with "".

💡 Suggested refactor pattern
-const eventRepository = resolveEventRepositoryForStore(
-  taskRunAttempt.taskRun.taskEventStore,
-  taskRunAttempt.taskRun.organizationId ?? ""
-);
+const organizationId = taskRunAttempt.taskRun.organizationId;
+if (!organizationId) {
+  logger.error("[CompleteAttemptService] Missing organizationId on task run", {
+    runId: taskRunAttempt.taskRun.id,
+    taskEventStore: taskRunAttempt.taskRun.taskEventStore,
+  });
+  return "COMPLETED";
+}
+
+const eventRepository = resolveEventRepositoryForStore(
+  taskRunAttempt.taskRun.taskEventStore,
+  organizationId
+);

Apply this consistently to the success, failure, and retry paths listed above.

Also applies to: 322-325, 547-550

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/services/completeAttempt.server.ts` around lines 166 -
169, The current calls to resolveEventRepositoryForStore (and the two other
repository resolution calls at the success, failure, and retry paths) use a
fallback organizationId of "" via taskRunAttempt.taskRun.organizationId ?? "",
which silently masks missing tenant IDs; change each to explicitly guard for a
missing organizationId on taskRunAttempt.taskRun and throw or return an explicit
error instead of using "" — locate usages of resolveEventRepositoryForStore and
the analogous repository resolution functions near the success/failure/retry
paths (also repeated around the other occurrences) and replace the fallback with
a check like if (!taskRunAttempt.taskRun.organizationId) { throw new
Error("Missing organizationId for taskRunAttempt"); } before calling the
resolver so the resolver always receives a real organizationId.

Comment on lines +1 to +3
import { describe, expect } from "vitest";

vi.mock("~/db.server", () => ({ prisma: {}, $replica: {} }));

⚠️ Potential issue | 🟡 Minor

Missing vi import from vitest.

The test uses vi.mock and vi.setConfig but only imports describe and expect from vitest. Add vi to the import.

Proposed fix
-import { describe, expect } from "vitest";
+import { describe, expect, vi } from "vitest";

Suggested change
-import { describe, expect } from "vitest";
+import { describe, expect, vi } from "vitest";

 vi.mock("~/db.server", () => ({ prisma: {}, $replica: {} }));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/test/clickhouseFactory.test.ts` around lines 1 - 3, The test
imports only describe and expect from vitest but uses vi (e.g., vi.mock and
vi.setConfig); update the import statement in clickhouseFactory.test.ts to also
import vi from "vitest" so calls like vi.mock(...) and vi.setConfig(...)
reference the vitest mock API correctly.

@matt-aitken force-pushed the org-scoped-clickhouse branch from 6179dc7 to 66f294e on April 15, 2026 at 16:52
@coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
apps/webapp/app/routes/admin.data-stores.tsx (2)

305-307: ⚠️ Potential issue | 🟡 Minor

Close these dialogs after submission completes, not during render.

Both components mutate state from render based on fetcher.data?.success. Besides the React warning, the persisted success payload means the dialog will immediately close again on the next open. Gate the close behind a submit→idle transition in an effect/ref instead.

🔧 One way to fix the pattern
-import { useState } from "react";
+import { useEffect, useRef, useState } from "react";
   const fetcher = useFetcher<{ success?: boolean; error?: string }>();
   const isSubmitting = fetcher.state !== "idle";
+  const wasSubmittingRef = useRef(false);

-  if (fetcher.data?.success && open) {
-    setOpen(false);
-  }
+  useEffect(() => {
+    const justFinished = wasSubmittingRef.current && fetcher.state === "idle";
+    if (justFinished && fetcher.data?.success && open) {
+      setOpen(false);
+    }
+    wasSubmittingRef.current = fetcher.state !== "idle";
+  }, [fetcher.state, fetcher.data?.success, open]);
   const fetcher = useFetcher<{ success?: boolean; error?: string }>();
   const isSubmitting = fetcher.state !== "idle";
+  const wasSubmittingRef = useRef(false);

-  // Close dialog on success
-  if (fetcher.data?.success && open) {
-    onOpenChange(false);
-  }
+  useEffect(() => {
+    const justFinished = wasSubmittingRef.current && fetcher.state === "idle";
+    if (justFinished && fetcher.data?.success && open) {
+      onOpenChange(false);
+    }
+    wasSubmittingRef.current = fetcher.state !== "idle";
+  }, [fetcher.state, fetcher.data?.success, open, onOpenChange]);

Also applies to: 375-377

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.data-stores.tsx` around lines 305 - 307, The
render currently closes dialogs directly when fetcher.data?.success is truthy
(e.g., the if (fetcher.data?.success && open) { setOpen(false); } lines), which
must be moved into an effect that detects a submit→idle transition; implement a
useEffect that tracks fetcher.state (with a ref for previous state) and when
prevState === 'submitting' && fetcher.state === 'idle' && fetcher.data?.success
is true, call setOpen(false); apply the same change to the other similar block
(lines 375-377) so closing is only triggered after submission completes rather
than during render or on subsequent re-renders.
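The submit→idle gate described above can be modeled as a pure predicate, independent of React. Names are illustrative, not the app's API:

```typescript
// Pure model of the submit→idle gate: close only when a submission just
// finished successfully, never from a stale success payload. Names are
// illustrative, not the app's API.
type FetcherState = "idle" | "submitting" | "loading";

function shouldCloseDialog(
  prevState: FetcherState,
  nextState: FetcherState,
  success: boolean | undefined,
  open: boolean
): boolean {
  const justFinished = prevState !== "idle" && nextState === "idle";
  return Boolean(justFinished && success && open);
}

// Transition submitting → idle with success: close.
console.log(shouldCloseDialog("submitting", "idle", true, true)); // true

// Reopening later (idle → idle) with the old success payload: stay open.
console.log(shouldCloseDialog("idle", "idle", true, true)); // false
```

Driving the close from the state transition rather than from the payload alone is what prevents the dialog from re-closing on the next open.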

31-31: ⚠️ Potential issue | 🟡 Minor

Import tryCatch from the core utils subpath.

This still pulls from the package root instead of the exported subpath used elsewhere in the webapp.

♻️ Proposed fix
-import { tryCatch } from "@trigger.dev/core";
+import { tryCatch } from "@trigger.dev/core/utils";

As per coding guidelines, "Import from @trigger.dev/core subpaths only, never from the root."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.data-stores.tsx` at line 31, Replace the
root-package import of tryCatch with the core utils subpath: update the import
statement that currently imports tryCatch from "@trigger.dev/core" to import
from the package's utils subpath (e.g., "@trigger.dev/core/utils/tryCatch") so
the tryCatch symbol is pulled from the exported utils subpath consistent with
other imports.
🧹 Nitpick comments (5)
apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts (1)

15-17: Unused parameters in schema after refactoring to factory-based client.

keepAliveEnabled, keepAliveIdleSocketTtl, and maxOpenConnections are parsed from the request but no longer used since the service now receives clickhouseFactory instead of constructing its own ClickHouse client with these options.

Consider removing these from the schema if they're no longer needed, or document if they're kept for future use.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts` around lines
15 - 17, Remove the now-unused keepAliveEnabled, keepAliveIdleSocketTtl, and
maxOpenConnections fields from the Zod request schema used by the admin API
runs-replication create route (the schema that currently parses those three
fields) since the service now accepts clickhouseFactory instead of constructing
a ClickHouse client; alternatively, if you intend to keep them for future use,
add a clear comment above the schema explaining they are intentionally preserved
and currently unused. Ensure any code that referenced these properties
(parsing/validation logic for the route) is updated accordingly so no unused
params remain.
apps/webapp/app/v3/eventRepository/index.server.ts (1)

86-92: Use EVENT_STORE_TYPES constants instead of string literals.

Lines 86 and 90 use raw string literals "clickhouse_v2" and "clickhouse" while EVENT_STORE_TYPES.CLICKHOUSE_V2 and EVENT_STORE_TYPES.CLICKHOUSE are defined and used elsewhere in this file (e.g., lines 64, 68). Consider using the constants for consistency.

♻️ Proposed fix
-  if (taskEventStore === "clickhouse_v2") {
+  if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE_V2) {
     return { repository: resolvedRepository, store: "clickhouse_v2" };
   }

-  if (taskEventStore === "clickhouse") {
+  if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE) {
     return { repository: resolvedRepository, store: "clickhouse" };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/eventRepository/index.server.ts` around lines 86 - 92,
Replace the raw string comparisons for taskEventStore with the existing
EVENT_STORE_TYPES constants: change the conditions that check for
"clickhouse_v2" and "clickhouse" to use EVENT_STORE_TYPES.CLICKHOUSE_V2 and
EVENT_STORE_TYPES.CLICKHOUSE respectively, and return the same { repository:
resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 } / { repository:
resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE } so the code
consistently uses the EVENT_STORE_TYPES enum instead of string literals.
apps/webapp/app/presenters/v3/UsagePresenter.server.ts (1)

121-132: Remove unused replica parameter.

The replica parameter is declared but never used in getTaskUsageByOrganization. Consider removing it to clean up the function signature.

♻️ Proposed fix
 async function getTaskUsageByOrganization(
   organizationId: string,
   startOfMonth: Date,
-  endOfMonth: Date,
-  replica: PrismaClientOrTransaction
+  endOfMonth: Date
 ) {
   const clickhouse = await clickhouseFactory.getClickhouseForOrganization(organizationId, "standard");

And update the call site at line 107-112:

     const tasks = await getTaskUsageByOrganization(
       organizationId,
       startOfMonth,
-      endOfMonth,
-      this._replica
+      endOfMonth
     );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/presenters/v3/UsagePresenter.server.ts` around lines 121 -
132, The function getTaskUsageByOrganization declares an unused parameter
replica—remove replica from the function signature and any related type
annotations so the function accepts (organizationId: string, startOfMonth: Date,
endOfMonth: Date) only; update all call sites that pass a replica (e.g., where
getTaskUsageByOrganization(...) is invoked) to stop supplying that argument, and
run a quick search for references to getTaskUsageByOrganization to ensure
callers and imports compile; no changes needed to the internal calls to
clickhouseFactory.getClickhouseForOrganization or
clickhouse.taskRuns.getTaskUsageByOrganization.
apps/webapp/app/v3/runEngineHandlers.server.ts (2)

63-66: Inconsistent source for taskEventStore across handlers.

These handlers use run.taskEventStore from the event payload, while runExpired (line 358) and runCancelled (line 412) use taskRun.taskEventStore from the database query. Since taskRun is already fetched from the DB with taskEventStore selected, consider using taskRun.taskEventStore consistently across all handlers for a single source of truth.

Suggested fix for consistency
     const eventRepository = resolveEventRepositoryForStore(
-      run.taskEventStore,
+      taskRun.taskEventStore,
       taskRun.organizationId ?? organization.id
     );

Apply this pattern to both runSucceeded (lines 63-66) and runFailed (lines 133-136).

Also applies to: 133-136

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/runEngineHandlers.server.ts` around lines 63 - 66, Use the
DB-fetched taskRun.taskEventStore as the single source of truth instead of the
event payload’s run.taskEventStore: in the runSucceeded and runFailed handlers
replace references to run.taskEventStore with taskRun.taskEventStore (the same
field used in runExpired and runCancelled) when calling
resolveEventRepositoryForStore so all handlers consistently use
taskRun.taskEventStore.

447-450: Consider extracting "taskEvent" to a named constant.

The fallback "taskEvent" appears to be a default store identifier. Per coding guidelines, sentinel/placeholder values should use named constants to avoid scattered magic strings and potential typos.

Suggested refactor
+const DEFAULT_TASK_EVENT_STORE = "taskEvent";
+
 // ... in the handler:
       const eventRepository = resolveEventRepositoryForStore(
-        run.taskEventStore ?? "taskEvent",
+        run.taskEventStore ?? DEFAULT_TASK_EVENT_STORE,
         organization.id
       );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/runEngineHandlers.server.ts` around lines 447 - 450,
Extract the magic fallback string "taskEvent" into a named constant (e.g.,
DEFAULT_TASK_EVENT_STORE or TASK_EVENT_STORE) and replace the inline literal in
the call to resolveEventRepositoryForStore where run.taskEventStore ??
"taskEvent" is used; update any other usages of the same sentinel value to
reference the constant to avoid duplicated magic strings and typos, and ensure
the constant is exported/placed in a shared config/constants module if used
across files.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/webapp/app/routes/admin.data-stores.tsx`:
- Around line 91-94: The trimmed organizationIds array can end up empty (e.g.,
rawOrgIds=", , ") and should be rejected; after computing organizationIds from
rawOrgIds (the .split(",").map(s=>s.trim()).filter(Boolean) logic), validate
that organizationIds.length > 0 and return/throw a validation error (or add a
form error) if it’s zero so create/update is blocked; apply the same check to
the second occurrence of this logic (the block around lines 115-118) to ensure
normalized arrays are validated rather than the raw string.

---

Duplicate comments:
In `@apps/webapp/app/routes/admin.data-stores.tsx`:
- Around line 305-307: The render currently closes dialogs directly when
fetcher.data?.success is truthy (e.g., the if (fetcher.data?.success && open) {
setOpen(false); } lines), which must be moved into an effect that detects a
submit→idle transition; implement a useEffect that tracks fetcher.state (with a
ref for previous state) and when prevState === 'submitting' && fetcher.state ===
'idle' && fetcher.data?.success is true, call setOpen(false); apply the same
change to the other similar block (lines 375-377) so closing is only triggered
after submission completes rather than during render or on subsequent
re-renders.
- Line 31: Replace the root-package import of tryCatch with the core utils
subpath: update the import statement that currently imports tryCatch from
"@trigger.dev/core" to import from the package's utils subpath (e.g.,
"@trigger.dev/core/utils/tryCatch") so the tryCatch symbol is pulled from the
exported utils subpath consistent with other imports.

---

Nitpick comments:
In `@apps/webapp/app/presenters/v3/UsagePresenter.server.ts`:
- Around line 121-132: The function getTaskUsageByOrganization declares an
unused parameter replica—remove replica from the function signature and any
related type annotations so the function accepts (organizationId: string,
startOfMonth: Date, endOfMonth: Date) only; update all call sites that pass a
replica (e.g., where getTaskUsageByOrganization(...) is invoked) to stop
supplying that argument, and run a quick search for references to
getTaskUsageByOrganization to ensure callers and imports compile; no changes
needed to the internal calls to clickhouseFactory.getClickhouseForOrganization
or clickhouse.taskRuns.getTaskUsageByOrganization.

In `@apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts`:
- Around line 15-17: Remove the now-unused keepAliveEnabled,
keepAliveIdleSocketTtl, and maxOpenConnections fields from the Zod request
schema used by the admin API runs-replication create route (the schema that
currently parses those three fields) since the service now accepts
clickhouseFactory instead of constructing a ClickHouse client; alternatively, if
you intend to keep them for future use, add a clear comment above the schema
explaining they are intentionally preserved and currently unused. Ensure any
code that referenced these properties (parsing/validation logic for the route)
is updated accordingly so no unused params remain.

In `@apps/webapp/app/v3/eventRepository/index.server.ts`:
- Around line 86-92: Replace the raw string comparisons for taskEventStore with
the existing EVENT_STORE_TYPES constants: change the conditions that check for
"clickhouse_v2" and "clickhouse" to use EVENT_STORE_TYPES.CLICKHOUSE_V2 and
EVENT_STORE_TYPES.CLICKHOUSE respectively, and return the same { repository:
resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 } / { repository:
resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE } so the code
consistently uses the EVENT_STORE_TYPES enum instead of string literals.

In `@apps/webapp/app/v3/runEngineHandlers.server.ts`:
- Around line 63-66: Use the DB-fetched taskRun.taskEventStore as the single
source of truth instead of the event payload’s run.taskEventStore: in the
runSucceeded and runFailed handlers replace references to run.taskEventStore
with taskRun.taskEventStore (the same field used in runExpired and runCancelled)
when calling resolveEventRepositoryForStore so all handlers consistently use
taskRun.taskEventStore.
- Around line 447-450: Extract the magic fallback string "taskEvent" into a
named constant (e.g., DEFAULT_TASK_EVENT_STORE or TASK_EVENT_STORE) and replace
the inline literal in the call to resolveEventRepositoryForStore where
run.taskEventStore ?? "taskEvent" is used; update any other usages of the same
sentinel value to reference the constant to avoid duplicated magic strings and
typos, and ensure the constant is exported/placed in a shared config/constants
module if used across files.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 72fb2023-f0f8-4d55-80e6-70728d83f0fe

📥 Commits

Reviewing files that changed from the base of the PR and between 6179dc7 and 66f294e.

📒 Files selected for processing (79)
  • .cursor/mcp.json
  • .server-changes/organization-scoped-clickhouse.md
  • CLAUDE.md
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/presenters/v3/ApiRunListPresenter.server.ts
  • apps/webapp/app/presenters/v3/CreateBulkActionPresenter.server.ts
  • apps/webapp/app/presenters/v3/RunPresenter.server.ts
  • apps/webapp/app/presenters/v3/RunTagListPresenter.server.ts
  • apps/webapp/app/presenters/v3/SpanPresenter.server.ts
  • apps/webapp/app/presenters/v3/TaskListPresenter.server.ts
  • apps/webapp/app/presenters/v3/UsagePresenter.server.ts
  • apps/webapp/app/presenters/v3/ViewSchedulePresenter.server.ts
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.dashboards.$dashboardKey/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.$modelId/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.compare/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts.$promptSlug/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs.$runParam/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.runs._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.test.tasks.$taskParam/route.tsx
  • apps/webapp/app/routes/admin.api.v1.runs-replication.create.ts
  • apps/webapp/app/routes/admin.api.v1.runs-replication.start.ts
  • apps/webapp/app/routes/admin.data-stores.tsx
  • apps/webapp/app/routes/admin.tsx
  • apps/webapp/app/routes/api.v1.prompts.$slug.ts
  • apps/webapp/app/routes/api.v1.prompts.$slug.versions.ts
  • apps/webapp/app/routes/api.v1.prompts._index.ts
  • apps/webapp/app/routes/api.v1.runs.$runId.events.ts
  • apps/webapp/app/routes/api.v1.runs.$runId.spans.$spanId.ts
  • apps/webapp/app/routes/api.v1.runs.$runId.trace.ts
  • apps/webapp/app/routes/otel.v1.logs.ts
  • apps/webapp/app/routes/otel.v1.metrics.ts
  • apps/webapp/app/routes/otel.v1.traces.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.$logId.tsx
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts.$promptSlug.generations.ts
  • apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts
  • apps/webapp/app/runEngine/concerns/traceEvents.server.ts
  • apps/webapp/app/runEngine/services/triggerFailedTask.server.ts
  • apps/webapp/app/services/admin/missingLlmModels.server.ts
  • apps/webapp/app/services/clickhouse/clickhouseFactory.server.ts
  • apps/webapp/app/services/clickhouse/clickhouseSecretSchemas.server.ts
  • apps/webapp/app/services/clickhouseInstance.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoreConfigSchemas.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistryInstance.server.ts
  • apps/webapp/app/services/queryService.server.ts
  • apps/webapp/app/services/runsReplicationInstance.server.ts
  • apps/webapp/app/services/runsReplicationService.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts
  • apps/webapp/app/v3/eventRepository/eventRepository.server.ts
  • apps/webapp/app/v3/eventRepository/eventRepository.types.ts
  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/v3/otlpExporter.server.ts
  • apps/webapp/app/v3/runEngineHandlers.server.ts
  • apps/webapp/app/v3/services/alerts/errorAlertEvaluator.server.ts
  • apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
  • apps/webapp/app/v3/services/cancelTaskRunV1.server.ts
  • apps/webapp/app/v3/services/completeAttempt.server.ts
  • apps/webapp/app/v3/services/crashTaskRun.server.ts
  • apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
  • apps/webapp/app/v3/services/triggerTaskV1.server.ts
  • apps/webapp/test/clickhouseFactory.test.ts
  • apps/webapp/test/organizationDataStoresRegistry.test.ts
  • apps/webapp/test/runsBackfiller.test.ts
  • apps/webapp/test/runsReplicationBenchmark.test.ts
  • apps/webapp/test/runsReplicationService.part1.test.ts
  • apps/webapp/test/runsReplicationService.part2.test.ts
  • apps/webapp/test/utils/replicationUtils.ts
  • apps/webapp/test/utils/testReplicationClickhouseFactory.ts
  • apps/webapp/test/utils/tracing.ts
  • internal-packages/database/prisma/migrations/20260331212308_add_organization_data_stores/migration.sql
  • internal-packages/database/prisma/schema.prisma
💤 Files with no reviewable changes (2)
  • apps/webapp/app/services/clickhouseInstance.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepositoryInstance.server.ts
✅ Files skipped from review due to trivial changes (18)
  • apps/webapp/app/v3/services/expireEnqueuedRun.server.ts
  • apps/webapp/app/routes/admin.tsx
  • CLAUDE.md
  • .server-changes/organization-scoped-clickhouse.md
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx
  • apps/webapp/app/routes/resources.runs.$runParam.logs.download.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx
  • apps/webapp/app/v3/services/bulk/BulkActionV2.server.ts
  • apps/webapp/app/v3/services/completeAttempt.server.ts
  • apps/webapp/test/runsReplicationService.part1.test.ts
  • apps/webapp/app/env.server.ts
  • apps/webapp/app/services/clickhouse/clickhouseSecretSchemas.server.ts
  • apps/webapp/test/utils/testReplicationClickhouseFactory.ts
  • internal-packages/database/prisma/migrations/20260331212308_add_organization_data_stores/migration.sql
  • internal-packages/database/prisma/schema.prisma
  • apps/webapp/test/organizationDataStoresRegistry.test.ts
  • apps/webapp/app/services/dataStores/organizationDataStoreConfigSchemas.server.ts
  • apps/webapp/test/runsReplicationService.part2.test.ts
🚧 Files skipped from review as they are similar to previous changes (34)
  • .cursor/mcp.json
  • apps/webapp/app/routes/admin.api.v1.runs-replication.start.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models.compare/route.tsx
  • apps/webapp/app/routes/otel.v1.traces.ts
  • apps/webapp/app/v3/services/triggerTaskV1.server.ts
  • apps/webapp/app/services/queryService.server.ts
  • apps/webapp/app/routes/otel.v1.metrics.ts
  • apps/webapp/test/utils/tracing.ts
  • apps/webapp/test/utils/replicationUtils.ts
  • apps/webapp/app/runEngine/concerns/traceEvents.server.ts
  • apps/webapp/app/routes/api.v1.runs.$runId.events.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.prompts._index/route.tsx
  • apps/webapp/app/presenters/v3/WaitpointPresenter.server.ts
  • apps/webapp/app/presenters/v3/CreateBulkActionPresenter.server.ts
  • apps/webapp/app/routes/api.v1.runs.$runId.trace.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.$logId.tsx
  • apps/webapp/app/routes/api.v1.runs.$runId.spans.$spanId.ts
  • apps/webapp/test/runsBackfiller.test.ts
  • apps/webapp/app/routes/api.v1.prompts._index.ts
  • apps/webapp/app/services/runsReplicationInstance.server.ts
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.dashboards.$dashboardKey/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
  • apps/webapp/app/routes/otel.v1.logs.ts
  • apps/webapp/app/presenters/v3/RunTagListPresenter.server.ts
  • apps/webapp/app/v3/eventRepository/eventRepository.types.ts
  • apps/webapp/app/v3/eventRepository/eventRepository.server.ts
  • apps/webapp/app/presenters/v3/RunPresenter.server.ts
  • apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.logs.ts
  • apps/webapp/app/services/runsReplicationService.server.ts
  • apps/webapp/app/services/admin/missingLlmModels.server.ts
  • apps/webapp/app/v3/eventRepository/clickhouseEventRepository.server.ts
  • apps/webapp/app/v3/services/crashTaskRun.server.ts
  • apps/webapp/app/presenters/v3/TaskListPresenter.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts

Comment on lines +91 to +94
const organizationIds = rawOrgIds
.split(",")
.map((s) => s.trim())
.filter(Boolean);

⚠️ Potential issue | 🟡 Minor

Reject org lists that collapse to zero IDs after trimming.

Values like ", , " pass the current schema, become [], and then create/update an unusable data store with no organizations attached. Validate the normalized array, not just the raw string.

🛡️ Proposed fix
       const organizationIds = rawOrgIds
         .split(",")
         .map((s) => s.trim())
         .filter(Boolean);
+
+      if (organizationIds.length === 0) {
+        return typedjson({ error: "At least one organization ID is required" }, { status: 400 });
+      }

Also applies to: 115-118

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/routes/admin.data-stores.tsx` around lines 91 - 94, The
trimmed organizationIds array can end up empty (e.g., rawOrgIds=", , ") and
should be rejected; after computing organizationIds from rawOrgIds (the
.split(",").map(s=>s.trim()).filter(Boolean) logic), validate that
organizationIds.length > 0 and return/throw a validation error (or add a form
error) if it’s zero so create/update is blocked; apply the same check to the
second occurrence of this logic (the block around lines 115-118) to ensure
normalized arrays are validated rather than the raw string.


@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 6 potential issues.

View 5 additional findings in Devin Review.

Open in Devin Review

Comment on lines +80 to +84
const taskEventStore = parentStore ?? (await resolveTaskEventRepositoryFlag(featureFlags));
const { repository: resolvedRepository } = await clickhouseFactory.getEventRepositoryForOrganization(
taskEventStore,
organizationId
);

@devin-ai-integration devin-ai-integration bot Apr 15, 2026


🔴 getV3EventRepository throws for "postgres" parentStore instead of falling back to Postgres event repository

In the refactored getV3EventRepository, when parentStore is a string (e.g. "postgres" from a legacy run's taskEventStore), it is unconditionally passed to clickhouseFactory.getEventRepositoryForOrganization(parentStore, organizationId). This internally calls buildEventRepository(store, ...) at clickhouseFactory.server.ts:336, which only handles "clickhouse" and "clickhouse_v2" — the default case throws "Unknown ClickHouse event repository store: postgres". The old code explicitly handled non-ClickHouse stores by falling back to the Prisma-based eventRepository. This crashes V1 trigger paths (triggerTaskV1.server.ts:296) when a parent run uses the Postgres event store.
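The fallback described here can be sketched as follows. All names (`CLICKHOUSE_STORES`, `resolveRepository`, the stub repositories) are illustrative stand-ins, not the repo's actual identifiers — the point is guarding before the factory call so legacy store values never reach the throwing default case:

```typescript
// Hypothetical sketch: only known ClickHouse stores go through the factory;
// anything else (e.g. "postgres" on a legacy run) falls back to the
// Prisma-based repository instead of hitting a throwing default case.
type EventRepository = { kind: string };

const CLICKHOUSE_STORES = new Set(["clickhouse", "clickhouse_v2"]);

const postgresEventRepository: EventRepository = { kind: "postgres" };

function buildClickhouseRepository(store: string): EventRepository {
  if (!CLICKHOUSE_STORES.has(store)) {
    // Mirrors the throwing default case the finding describes.
    throw new Error(`Unknown ClickHouse event repository store: ${store}`);
  }
  return { kind: store };
}

function resolveRepository(parentStore: string | undefined): EventRepository {
  // Guard before calling the factory so legacy stores never reach the throw.
  if (!parentStore || !CLICKHOUSE_STORES.has(parentStore)) {
    return postgresEventRepository;
  }
  return buildClickhouseRepository(parentStore);
}
```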


Fixed in de2996d. getV3EventRepository now falls back to the Prisma-based eventRepository when parentStore is any non-ClickHouse value (e.g. "postgres") instead of handing it to buildEventRepository and throwing.

Comment on lines +248 to +253
const clickhouseClient = await clickhouseFactory.getClickhouseForOrganization(
environment.organizationId,
"query"
);

const presenter = new ErrorGroupPresenter($replica, clickhouseClient, clickhouseClient);

@devin-ai-integration devin-ai-integration bot Apr 15, 2026


🔴 ErrorGroupPresenter receives the "logs" ClickHouse client for both constructor parameters instead of "logs" + "standard"

The ErrorGroupPresenter constructor takes three arguments: (replica, logsClickhouse, clickhouse). The logsClickhouse is used for error occurrence/summary queries, while clickhouse is used for the run list via NextRunListPresenter (ErrorGroupPresenter.server.ts:413). The old code passed two distinct clients: new ErrorGroupPresenter($replica, logsClickhouseClient, clickhouseClient). The new code fetches a single client with clientType: "logs" and passes it as both arguments: new ErrorGroupPresenter($replica, clickhouseClient, clickhouseClient). This means the NextRunListPresenter run list queries execute against the logs ClickHouse instance with logs-specific settings (e.g., restrictive max_memory_usage, max_rows_to_read), which may cause those queries to fail or return incomplete results. For org-scoped ClickHouse setups, this also routes run data queries through the wrong client type.
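The two-client wiring the old code had can be sketched as below. The factory and client types are stand-ins (the real `getClickhouseForOrganization` is async and returns a full client), but the shape of the fix is the same: fetch two distinct clients rather than reusing the logs client for both constructor arguments:

```typescript
// Hypothetical sketch: the presenter takes a logs-tuned client for
// occurrence/summary queries and a separate standard client for the
// run list, instead of the same logs client for both.
type ClientType = "logs" | "standard";
type ClickhouseClient = { clientType: ClientType };

const factory = {
  getClickhouseForOrganization(_orgId: string, clientType: ClientType): ClickhouseClient {
    return { clientType };
  },
};

function buildErrorGroupPresenterClients(orgId: string) {
  // Two distinct fetches, one per role the presenter plays.
  const logsClickhouse = factory.getClickhouseForOrganization(orgId, "logs");
  const clickhouse = factory.getClickhouseForOrganization(orgId, "standard");
  return { logsClickhouse, clickhouse };
}
```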



Fixed in de2996d. The fingerprint route now fetches two separate clients — "logs" for the error occurrence/summary queries and "standard" for the NextRunListPresenter run list — matching the pre-refactor behavior (logsClickhouseClient + clickhouseClient).

Comment on lines +10 to +12
registry.loadFromDatabase().catch((err) => {
console.error("[OrganizationDataStoresRegistry] Failed to initialize", err);
});

@devin-ai-integration devin-ai-integration bot Apr 15, 2026


🔴 isReady promise never resolves if initial loadFromDatabase() fails, causing service-wide hangs

In organizationDataStoresRegistryInstance.server.ts:10-12, if loadFromDatabase() throws (e.g., database unavailable at startup), the error is caught and logged but _readyResolve() in organizationDataStoresRegistry.server.ts:82 is never called. This means registry.isReady never resolves. Any code that calls await clickhouseFactory.isReady() (clickhouseFactory.server.ts:194-196) will hang forever — this includes OTLP exporter initialization (otlpExporter.server.ts:1208), runs replication start (runsReplicationInstance.server.ts:73), and the admin replication create route (admin.api.v1.runs-replication.create.ts:65). OTEL data ingestion and runs replication would become permanently unresponsive after a transient DB failure during startup.
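The shape of the later fix (retrying the initial load so `isReady` always eventually resolves) can be sketched as below. The class and method names are illustrative; the real implementation uses p-retry with exponential backoff, while this loop omits delays for brevity:

```typescript
// Hypothetical sketch: retry the initial load so a transient DB failure
// at boot never leaves isReady permanently unresolved.
class DataStoresRegistry {
  private readyResolve!: () => void;
  readonly isReady = new Promise<void>((resolve) => {
    this.readyResolve = resolve;
  });

  constructor(private readonly load: () => Promise<void>) {}

  async init(): Promise<void> {
    for (;;) {
      try {
        await this.load();
        this.readyResolve(); // resolves only after the first successful load
        return;
      } catch {
        // a production version would back off (e.g. p-retry) before looping
      }
    }
  }
}
```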



Addressed by c5f7ca3 ("Retry the initial boot using p-retry"): startup now retries loadFromDatabase forever with exponential backoff, so a transient Postgres outage at boot no longer leaves isReady permanently unresolved — it resolves as soon as the DB is reachable.

Comment on lines +164 to +179
function buildOrgClickhouseClient(url: string, clientType: ClientType): ClickHouse {
const parsed = new URL(url);
parsed.searchParams.delete("secure");

return new ClickHouse({
url: parsed.toString(),
name: `org-clickhouse-${clientType}`,
keepAlive: {
enabled: env.CLICKHOUSE_KEEP_ALIVE_ENABLED === "1",
idleSocketTtl: env.CLICKHOUSE_KEEP_ALIVE_IDLE_SOCKET_TTL_MS,
},
logLevel: env.CLICKHOUSE_LOG_LEVEL,
compression: { request: true },
maxOpenConnections: env.CLICKHOUSE_MAX_OPEN_CONNECTIONS,
});
}

🚩 Org-specific ClickHouse clients don't inherit specialized settings from default clients

The buildOrgClickhouseClient function at clickhouseFactory.server.ts:164-179 creates ClickHouse clients for org-specific data stores using generic settings (the standard CLICKHOUSE_* env vars). It does NOT replicate specialized settings that default clients have — for example, the logs client has max_memory_usage, max_bytes_before_external_sort, max_threads settings (clickhouseFactory.server.ts:58-69), and the replication client uses RUN_REPLICATION_* settings (clickhouseFactory.server.ts:140-150). When an org has a dedicated ClickHouse and a caller requests a "logs" or "replication" client type, they get a generic client without these query protections. This may be intentional (org-dedicated ClickHouse may not need the same guardrails), but could lead to surprising behavior differences between default and org-specific paths.
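One way to close this gap would be a shared per-clientType settings table consumed by both the default and org-scoped builders. The sketch below is an assumption, not the repo's code: the setting names echo the ones mentioned in the finding, the values are purely illustrative, and the option key (`settings`) is a stand-in for whatever the ClickHouse wrapper actually accepts:

```typescript
// Hypothetical sketch: one settings table per clientType, reused by both
// client builders so org-scoped clients keep the same guardrails.
type ClientType = "standard" | "query" | "logs" | "replication";

const SETTINGS_BY_CLIENT_TYPE: Record<ClientType, Record<string, number>> = {
  standard: {},
  query: {},
  logs: {
    max_memory_usage: 4_000_000_000, // illustrative value
    max_rows_to_read: 10_000_000, // illustrative value
  },
  replication: {
    max_insert_block_size: 1_000_000, // illustrative value
  },
};

function orgClientOptions(url: string, clientType: ClientType) {
  return {
    url,
    name: `org-clickhouse-${clientType}`,
    // Reusing the shared table keeps org clients consistent with defaults.
    settings: SETTINGS_BY_CLIENT_TYPE[clientType],
  };
}
```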


Comment on lines +94 to +118
async addDataStore({
key,
kind,
organizationIds,
config,
}: {
key: string;
kind: DataStoreKind;
organizationIds: string[];
config: any;
}) {
const secretKey = this.#secretKey(key, kind);

const secretStore = getSecretStore("DATABASE", { prismaClient: this._prisma });
await secretStore.setSecret(secretKey, config);

return this._prisma.organizationDataStore.create({
data: {
key,
organizationIds,
kind: "CLICKHOUSE",
config: { version: 1, data: { secretKey } },
},
});
}

🚩 addDataStore/updateDataStore don't call reload() — in-memory registry is stale until next periodic reload

After addDataStore, updateDataStore, or deleteDataStore in organizationDataStoresRegistry.server.ts, the in-memory _lookup map is NOT updated. The changes only persist to the database. The registry won't reflect changes until the next periodic reload interval (default 60s per ORGANIZATION_DATA_STORES_RELOAD_INTERVAL_MS). For the admin UI at admin.data-stores.tsx, this means an admin could add a data store and for up to 60 seconds, traffic for the affected orgs would still route to the default ClickHouse. The admin page doesn't call reload() after mutations either. This is likely acceptable for the initial implementation but could surprise operators expecting immediate effect.


Comment on lines +186 to +189
/** ClickHouse clients keyed by hostname hash + clientType. */
private readonly _clientCache = new Map<string, ClickHouse>();
/** Event repositories keyed by hostname hash (stateful, must be reused). */
private readonly _eventRepositoryCache = new Map<string, ClickhouseEventRepository>();

🚩 Client cache in ClickhouseFactory is never invalidated when data stores change

The _clientCache and _eventRepositoryCache in ClickhouseFactory (clickhouseFactory.server.ts:187-189) are populated when org-specific ClickHouse clients are first created but are never cleared when the underlying OrganizationDataStoresRegistry reloads. If an org's data store URL changes (e.g., during a ClickHouse migration), the old cached client/repository would continue to be used until the process restarts. The registry reload at organizationDataStoresRegistryInstance.server.ts:14-17 updates the _lookup map, but the factory's caches are separate. This is a cache staleness concern for long-running processes.
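One hedged way to address this staleness, sketched below with illustrative names: key each cache entry by its connection URL as well, so a registry reload that changes an org's data-store URL naturally rebuilds the client instead of serving the stale one until restart:

```typescript
// Hypothetical sketch: a cache that rebuilds its entry whenever the URL
// recorded for a given cache key (e.g. "orgId:clientType") changes.
class UrlAwareClientCache<T> {
  private readonly cache = new Map<string, { url: string; client: T }>();

  constructor(private readonly build: (url: string) => T) {}

  get(cacheKey: string, url: string): T {
    const entry = this.cache.get(cacheKey);
    if (entry && entry.url === url) {
      return entry.client; // URL unchanged: reuse the existing client
    }
    // URL changed (or first use): rebuild and replace the cached entry.
    const client = this.build(url);
    this.cache.set(cacheKey, { url, client });
    return client;
  }
}
```

A real version would also need to close the replaced client to avoid leaking connections; that step is omitted here.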



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (2)
apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts (2)

77-80: ⚠️ Potential issue | 🟠 Major

Fail fast on duplicate org+kind assignments.

This loop silently overwrites an existing ${orgId}:${row.kind} entry, so overlapping assignments can produce nondeterministic routing depending on row order. Please throw on conflict instead of last-write-wins.

Suggested fix
       for (const orgId of row.organizationIds) {
         const key = `${orgId}:${row.kind}`;
+        const existing = lookup.get(key);
+        if (existing) {
+          throw new Error(
+            `[OrganizationDataStoresRegistry] Duplicate assignment for ${key} while loading "${row.key}"`
+          );
+        }
         lookup.set(key, parsed);
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`
around lines 77 - 80, The loop that maps organization IDs to kinds silently
overwrites duplicates (for const orgId of row.organizationIds { const key =
`${orgId}:${row.kind}`; lookup.set(key, parsed); }), causing nondeterministic
routing; change this to fail-fast by checking lookup.has(key) before setting and
throw a descriptive error (including orgId and row.kind or row id) when a
conflict is detected so overlapping org+kind assignments cannot be inserted
silently. Ensure the thrown error stops processing so callers see the conflict
immediately.

99-123: ⚠️ Potential issue | 🟠 Major

Refresh _lookup immediately after successful mutations.

get() serves only in-memory state, but add/update/delete only persist to DB/secret store. Routing can remain stale until the periodic reload runs.

Suggested fix
   async addDataStore(...) {
     ...
-    return this._prisma.organizationDataStore.create({
+    const created = await this._prisma.organizationDataStore.create({
       data: {
         key,
         organizationIds,
         kind: "CLICKHOUSE",
         config: { version: 1, data: { secretKey } },
       },
     });
+    await this.reload();
+    return created;
   }

   async updateDataStore(...) {
     ...
-    return this._prisma.organizationDataStore.update({
+    const updated = await this._prisma.organizationDataStore.update({
       where: { key },
       data: {
         organizationIds,
         kind: "CLICKHOUSE",
       },
     });
+    await this.reload();
+    return updated;
   }

   async deleteDataStore(...) {
     ...
     await this._prisma.organizationDataStore.delete({ where: { key } });
+    await this.reload();
   }

Also applies to: 125-152, 154-162

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`
around lines 99 - 123, addDataStore (and the similar update/delete methods) only
persist to DB/secret store but don't refresh the in-memory routing state, so
call the registry refresh after successful mutations: after
secretStore.setSecret (and after prisma create/update/delete and any
secretStore.removeSecret) await this._lookup() (or the registry's refresh
method) so in-memory state is updated immediately; apply the same change in the
corresponding updateDataStore and deleteDataStore flows referenced in the
comment.
🧹 Nitpick comments (1)
apps/webapp/app/v3/eventRepository/index.server.ts (1)

112-119: Use EVENT_STORE_TYPES here instead of repeating raw literals.

This file already centralizes store IDs in EVENT_STORE_TYPES; duplicating the raw "clickhouse" and "clickhouse_v2" literals here means a later rename or addition can silently miss this site.

♻️ Suggested cleanup
-  if (env.EVENT_REPOSITORY_DEFAULT_STORE === "clickhouse_v2") {
+  if (env.EVENT_REPOSITORY_DEFAULT_STORE === EVENT_STORE_TYPES.CLICKHOUSE_V2) {
     const { repository: resolvedRepository } =
-      await clickhouseFactory.getEventRepositoryForOrganization("clickhouse_v2", organizationId);
-    return { repository: resolvedRepository, store: "clickhouse_v2" };
-  } else if (env.EVENT_REPOSITORY_DEFAULT_STORE === "clickhouse") {
+      await clickhouseFactory.getEventRepositoryForOrganization(
+        EVENT_STORE_TYPES.CLICKHOUSE_V2,
+        organizationId
+      );
+    return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 };
+  } else if (env.EVENT_REPOSITORY_DEFAULT_STORE === EVENT_STORE_TYPES.CLICKHOUSE) {
     const { repository: resolvedRepository } =
-      await clickhouseFactory.getEventRepositoryForOrganization("clickhouse", organizationId);
-    return { repository: resolvedRepository, store: "clickhouse" };
+      await clickhouseFactory.getEventRepositoryForOrganization(
+        EVENT_STORE_TYPES.CLICKHOUSE,
+        organizationId
+      );
+    return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE };
   } else {
As per coding guidelines, "Use named constants for sentinel/placeholder values (e.g. `const UNSET_VALUE = "__unset__"`) instead of raw string literals scattered across comparisons."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/eventRepository/index.server.ts` around lines 112 - 119,
Replace the hard-coded store strings in the env check with the centralized
constants from EVENT_STORE_TYPES: instead of comparing
env.EVENT_REPOSITORY_DEFAULT_STORE to raw literals "clickhouse_v2" and
"clickhouse", use EVENT_STORE_TYPES.clickhouse_v2 and
EVENT_STORE_TYPES.clickhouse (or whatever exact keys exist on
EVENT_STORE_TYPES). Update both branches that call
clickhouseFactory.getEventRepositoryForOrganization to pass the matching
EVENT_STORE_TYPES value and return the corresponding store constant
(EVENT_STORE_TYPES.clickhouse_v2 or EVENT_STORE_TYPES.clickhouse) so future
renames/additions stay consistent; references to change are
env.EVENT_REPOSITORY_DEFAULT_STORE,
clickhouseFactory.getEventRepositoryForOrganization(...) and the returned store
value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`:
- Around line 99-109: The addDataStore function currently accepts config: any
and writes unvalidated data; replace the any with a proper typed schema
validated by zod before persistence: import or define a DataStoreConfigSchema
(in packages/core if available) and validate the incoming config inside
addDataStore (and the corresponding updateDataStore code paths referenced around
lines 130-135), throw or return a validation error if schema.parse/parseAsync
fails, and persist the typed result (not the raw any) so only validated config
is stored; update the function signature to use the inferred TypeScript type
from the zod schema instead of any.
- Around line 154-161: Change the operation order in deleteDataStore so the
organizationDataStore row is removed before deleting its associated secret: call
this._prisma.organizationDataStore.delete({ where: { key } }) first, then
compute the secretKey via this.#secretKey(key, kind) and call
secretStore.deleteSecret(secretKey) (using getSecretStore("DATABASE", {
prismaClient: this._prisma }) as before), still catching errors from secret
deletion to tolerate missing secrets.

In `@apps/webapp/app/v3/eventRepository/index.server.ts`:
- Around line 81-84: When taskEventStore === EVENT_STORE_TYPES.POSTGRES you must
return the actual Postgres store instead of calling the default
getTaskEventStore() which can return a rotated default (e.g. ClickHouse); change
the branch that returns { repository: eventRepository, store:
getTaskEventStore() } to fetch and return the Postgres-specific store (for
example by calling getTaskEventStore(EVENT_STORE_TYPES.POSTGRES) or the
dedicated getter for Postgres such as getPostgresTaskEventStore()) so the
returned store token matches the Postgres repository; update the branch guarding
taskEventStore, referencing taskEventStore, EVENT_STORE_TYPES.POSTGRES,
eventRepository, and getTaskEventStore() in your change.
- Around line 60-61: The ClickHouse factory is being called with inputs that may
be Postgres (taskEventStore, parentStore, foundRun.taskEventStore), which can
route Postgres runs into ClickHouse; update getConfiguredEventRepository,
getV3EventRepository, and recordRunEvent to check EVENT_STORE_TYPES.POSTGRES
first and return the Postgres eventRepository path (or skip the ClickHouse
registry logic) when any relevant store equals EVENT_STORE_TYPES.POSTGRES before
invoking clickhouseFactory.getEventRepositoryForOrganization (and specifically
before the call site using clickhouseFactory.getEventRepositoryForOrganization
that produces resolvedRepository).

---

Duplicate comments:
In
`@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`:
- Around line 77-80: The loop that maps organization IDs to kinds silently
overwrites duplicates (for const orgId of row.organizationIds { const key =
`${orgId}:${row.kind}`; lookup.set(key, parsed); }), causing nondeterministic
routing; change this to fail-fast by checking lookup.has(key) before setting and
throw a descriptive error (including orgId and row.kind or row id) when a
conflict is detected so overlapping org+kind assignments cannot be inserted
silently. Ensure the thrown error stops processing so callers see the conflict
immediately.
- Around line 99-123: addDataStore (and the similar update/delete methods) only
persist to DB/secret store but don't refresh the in-memory routing state, so
call the registry refresh after successful mutations: after
secretStore.setSecret (and after prisma create/update/delete and any
secretStore.removeSecret) await this._lookup() (or the registry's refresh
method) so in-memory state is updated immediately; apply the same change in the
corresponding updateDataStore and deleteDataStore flows referenced in the
comment.
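The fail-fast duplicate check described in the first duplicate comment above could be sketched as follows. This is a self-contained sketch, not the actual registry code: `DataStoreRow`, `buildLookup`, and the row shape are assumptions inferred from the review comment.

```typescript
type DataStoreRow = {
  id: string;
  kind: string;
  organizationIds: string[];
};

// Build the org+kind -> data store lookup. Instead of silently overwriting
// an earlier entry (nondeterministic routing), throw a descriptive error
// as soon as two rows claim the same org+kind pair.
function buildLookup(rows: DataStoreRow[]): Map<string, DataStoreRow> {
  const lookup = new Map<string, DataStoreRow>();
  for (const row of rows) {
    for (const orgId of row.organizationIds) {
      const key = `${orgId}:${row.kind}`;
      if (lookup.has(key)) {
        throw new Error(
          `Conflicting data store assignment for org ${orgId} and kind ${row.kind} (row ${row.id})`
        );
      }
      lookup.set(key, row);
    }
  }
  return lookup;
}
```

Throwing during reload surfaces the misconfiguration immediately instead of letting routing depend on row iteration order.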

---

Nitpick comments:
In `@apps/webapp/app/v3/eventRepository/index.server.ts`:
- Around line 112-119: Replace the hard-coded store strings in the env check
with the centralized constants from EVENT_STORE_TYPES: instead of comparing
env.EVENT_REPOSITORY_DEFAULT_STORE to raw literals "clickhouse_v2" and
"clickhouse", use EVENT_STORE_TYPES.clickhouse_v2 and
EVENT_STORE_TYPES.clickhouse (or whatever exact keys exist on
EVENT_STORE_TYPES). Update both branches that call
clickhouseFactory.getEventRepositoryForOrganization to pass the matching
EVENT_STORE_TYPES value and return the corresponding store constant
(EVENT_STORE_TYPES.clickhouse_v2 or EVENT_STORE_TYPES.clickhouse) so future
renames/additions stay consistent; references to change are
env.EVENT_REPOSITORY_DEFAULT_STORE,
clickhouseFactory.getEventRepositoryForOrganization(...) and the returned store
value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: bca7a471-e1a1-4493-b56f-9b5697a43186

📥 Commits

Reviewing files that changed from the base of the PR and between 66f294e and c5f7ca3.

📒 Files selected for processing (4)
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistryInstance.server.ts
  • apps/webapp/app/v3/eventRepository/index.server.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistryInstance.server.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (javascript-typescript)
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Always import from @trigger.dev/sdk. Never use @trigger.dev/sdk/v3 or deprecated client.defineJob when writing Trigger.dev tasks.

Import from @trigger.dev/core using subpaths only, never from the root path.

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Prefer static imports over dynamic imports. Only use dynamic import() when circular dependencies cannot be resolved otherwise, code splitting is genuinely needed for performance, or the module must be loaded conditionally at runtime.

Add crumbs as you write code using // @Crumbs line markers or `// `#region` `@crumbs blocks for debug tracing. They are stripped by agentcrumbs strip before merge. Use appropriate namespaces from the defined namespace table (webapp, supervisor, sdk, cli, core, build, react-hooks, redis-worker, run-engine, database, clickhouse, schedule-engine, run-queue).

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
apps/webapp/**/*.server.ts

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

apps/webapp/**/*.server.ts: Access environment variables via env export from app/env.server.ts. Never use process.env directly.
Always use findFirst instead of findUnique in Prisma queries. findUnique has implicit DataLoader batching with active bugs (uppercase UUIDs returning null, composite key SQL correctness, 5-10x worse performance). findFirst avoids these issues.
Access Prisma database via the singleton instance exported from app/db.server.ts.

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
apps/webapp/**/*.{server,test,spec}.ts

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

For testable code, never import env.server.ts in test files. Pass configuration as options instead. Separate testable services (e.g., realtimeClient.server.ts) from singleton factories (e.g., realtimeClientGlobal.server.ts).

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
apps/webapp/**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

Use named constants for sentinel/placeholder values (e.g. const UNSET_VALUE = "__unset__") instead of raw string literals scattered across comparisons.

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: Access environment variables through the env export of env.server.ts instead of directly accessing process.env
Use subpath exports from @trigger.dev/core package instead of importing from the root @trigger.dev/core path

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
apps/webapp/app/v3/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Only modify V2 code paths in services. Do not modify V1-only legacy code paths (MarQS queue, triggerTaskV1, cancelTaskRunV1, etc.) unless explicitly instructed. All new work uses Run Engine 2.0 and redis-worker.

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
🧠 Learnings (16)
📓 Common learnings
Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-16T13:03:24.754Z
Learning: Use `typecheck` for verification in apps and internal packages (`apps/*`, `internal-packages/*`). Never use `build` for these — building proves almost nothing about correctness. Use `build` only for public packages (`packages/*`).
Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-16T13:03:24.754Z
Learning: When modifying public packages (`packages/*` or `integrations/*`), add a changeset using `pnpm run changeset:add`. Default to patch for bug fixes and minor changes, confirm with maintainers before selecting minor for new features, and never select major without explicit approval.
Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-16T13:03:24.754Z
Learning: When modifying only server components (`apps/webapp/`, `apps/supervisor/`, etc.) with no package changes, add a `.server-changes/` file instead of a changeset.
Learnt from: CR
URL: 
File: CLAUDE.md:undefined-undefined
Timestamp: 2026-04-16T13:03:24.754Z
Learning: Do not flag `// crumbs` markers, `#region crumbs` blocks, or agentcrumbs imports during code review. These are temporary debug instrumentation stripped before merge.
📚 Learning: 2025-07-12T18:06:04.133Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-03-26T10:02:25.354Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3254
File: apps/webapp/app/services/platformNotifications.server.ts:363-385
Timestamp: 2026-03-26T10:02:25.354Z
Learning: In `triggerdotdev/trigger.dev`, the `getNextCliNotification` fallback in `apps/webapp/app/services/platformNotifications.server.ts` intentionally uses `prisma.orgMember.findFirst` (single org) when no `projectRef` is provided. This is acceptable for v1 because the CLI (`dev` and `login` commands) always passes `projectRef` in normal usage, making the fallback a rare edge case. Do not flag the single-org fallback as a multi-org correctness bug in this file.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-04-13T21:44:00.032Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/services/taskIdentifierRegistry.server.ts:24-67
Timestamp: 2026-04-13T21:44:00.032Z
Learning: In `apps/webapp/app/services/taskIdentifierRegistry.server.ts`, the sequential upsert/updateMany/findMany writes in `syncTaskIdentifiers` are intentionally NOT wrapped in a Prisma transaction. This function runs only during deployment-change events (low-concurrency path), and any partial `isInLatestDeployment` state is acceptable because it self-corrects on the next deployment. Do not flag this as a missing-transaction/atomicity issue in future reviews.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-04-15T15:39:22.674Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-04-15T15:39:22.674Z
Learning: Applies to apps/webapp/{app/v3/services/triggerTask.server.ts,app/v3/services/batchTriggerV3.server.ts,app/v3/queues.server.ts} : Do NOT add database queries to `triggerTask.server.ts` or `batchTriggerV3.server.ts` hot paths. Task defaults (TTL, etc.) are resolved via `backgroundWorkerTask.findFirst()` in `queues.server.ts` — piggyback on the existing query instead of adding new ones.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-03-22T19:34:22.737Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx:99-103
Timestamp: 2026-03-22T19:34:22.737Z
Learning: In `apps/webapp/app/presenters/v3/ErrorsListPresenter.server.ts`, the `statuses` filter (UNRESOLVED | RESOLVED | IGNORED) is applied in-memory after the ClickHouse query and a batch Postgres lookup via `getErrorGroupStates`. This is intentional: `status` lives in the Postgres `ErrorGroupState` table, not in ClickHouse, so post-query filtering is the correct approach. Do not flag this as a missing predicate or a no-op filter during code review.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-07T14:12:59.018Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3331
File: apps/webapp/app/runEngine/concerns/batchPayloads.server.ts:112-136
Timestamp: 2026-04-07T14:12:59.018Z
Learning: In `apps/webapp/app/runEngine/concerns/batchPayloads.server.ts`, the `pRetry` call wrapping `uploadPacketToObjectStore` intentionally retries **all** error types (no `shouldRetry` filter / `AbortError` guards). The maintainer explicitly prefers over-retrying to under-retrying because multiple heterogeneous object store backends are supported and it is impractical to enumerate all permanent error signatures. Do not flag this as an issue in future reviews.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-03-22T13:51:25.797Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/presenters/v3/PromptPresenter.server.ts:100-141
Timestamp: 2026-03-22T13:51:25.797Z
Learning: In the triggerdotdev/trigger.dev codebase, the ClickHouse server is configured with UTC as its timezone. Therefore, `toStartOfHour(start_time)` (without an explicit timezone argument) in ClickHouse queries correctly returns UTC-formatted strings that align with JavaScript `toISOString()`-derived UTC bucket keys (e.g., in `apps/webapp/app/presenters/v3/PromptPresenter.server.ts`). Do not flag this as a timezone mismatch bug.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-03-29T19:16:28.864Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3291
File: apps/webapp/app/v3/featureFlags.ts:53-65
Timestamp: 2026-03-29T19:16:28.864Z
Learning: When reviewing TypeScript code that uses Zod v3, treat `z.coerce.*()` schemas as their direct Zod type (e.g., `z.coerce.boolean()` returns a `ZodBoolean` with `_def.typeName === "ZodBoolean"`) rather than a `ZodEffects`. Only `.preprocess()`, `.refine()`/`.superRefine()`, and `.transform()` are expected to wrap schemas in `ZodEffects`. Therefore, in reviewers’ logic like `getFlagControlType`, do not flag/unblock failures that require unwrapping `ZodEffects` when the input schema is a `z.coerce.*` schema.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-15T15:39:22.674Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-04-15T15:39:22.674Z
Learning: Applies to apps/webapp/**/*.server.ts : Always use `findFirst` instead of `findUnique` in Prisma queries. `findUnique` has implicit DataLoader batching with active bugs (uppercase UUIDs returning null, composite key SQL correctness, 5-10x worse performance). `findFirst` avoids these issues.

Applied to files:

  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-03-13T13:42:57.512Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3213
File: apps/webapp/app/routes/admin.api.v1.llm-models.ts:52-68
Timestamp: 2026-03-13T13:42:57.512Z
Learning: In `apps/webapp/app/routes/admin.api.v1.llm-models.ts` (and related admin LLM model routes), multiple `LlmPricingTier` entries with the same `priority` under the same model are valid and intentional — tiers are matched in priority order. Duplicate-tier validation via Zod `superRefine` is not warranted as this is admin-only tooling with low concurrency.

Applied to files:

  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-04-02T20:25:54.203Z
Learnt from: samejr
Repo: triggerdotdev/trigger.dev PR: 3319
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx:1100-1101
Timestamp: 2026-04-02T20:25:54.203Z
Learning: In `apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx`, the `useFrozenValue(selectedModel)` + `displayModel = selectedModel ?? frozenModel` pattern is intentional. It keeps `ModelDetailPanel` rendered during the collapse animation even after `selectedModel` is cleared. The `key={displayModel.friendlyId}` handles remounting when a different model is selected. Do not flag this as a stale-state or tab-reset issue.

Applied to files:

  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-04-01T13:27:35.831Z
Learnt from: samejr
Repo: triggerdotdev/trigger.dev PR: 3308
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx:579-584
Timestamp: 2026-04-01T13:27:35.831Z
Learning: In `apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.models._index/route.tsx`, the `useEffect` inside `CompareDialog` intentionally uses `[open]` as its only dependency. This is a deliberate "fetch-on-open" pattern: the fetcher.load call should fire only when the dialog opens, not on every reference change of `models`, `organization`, `project`, or `environment`. Do not flag this as a missing-dependency lint issue.

Applied to files:

  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts
📚 Learning: 2026-03-26T09:02:07.973Z
Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 3274
File: apps/webapp/app/services/runsReplicationService.server.ts:922-924
Timestamp: 2026-03-26T09:02:07.973Z
Learning: When parsing Trigger.dev task run annotations in server-side services, keep `TaskRun.annotations` strictly conforming to the `RunAnnotations` schema from `trigger.dev/core/v3`. If the code already uses `RunAnnotations.safeParse` (e.g., in a `#parseAnnotations` helper), treat that as intentional/necessary for atomic, schema-accurate annotation handling. Do not recommend relaxing the annotation payload schema or using a permissive “passthrough” parse path, since the annotations are expected to be written atomically in one operation and should not contain partial/legacy payloads that would require a looser parser.

Applied to files:

  • apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts

Comment on lines +99 to +109
async addDataStore({
key,
kind,
organizationIds,
config,
}: {
key: string;
kind: DataStoreKind;
organizationIds: string[];
config: any;
}) {

⚠️ Potential issue | 🟠 Major

Validate write-path config with Zod and remove any.

config is untyped/unvalidated at persistence boundaries, so malformed payloads can be stored and only fail later during reload.

Suggested fix
   async addDataStore({
     key,
     kind,
     organizationIds,
     config,
   }: {
     key: string;
     kind: DataStoreKind;
     organizationIds: string[];
-    config: any;
+    config: unknown;
   }) {
     const secretKey = this.#secretKey(key, kind);

     const secretStore = getSecretStore("DATABASE", { prismaClient: this._prisma });
-    await secretStore.setSecret(secretKey, config);
+    const parsedConfig = ClickhouseConnectionSchema.parse(config);
+    await secretStore.setSecret(secretKey, parsedConfig);
   async updateDataStore({
     key,
     kind,
     organizationIds,
     config,
   }: {
     key: string;
     kind: DataStoreKind;
     organizationIds: string[];
-    config?: any;
+    config?: unknown;
   }) {
     const secretKey = this.#secretKey(key, kind);

-    if (config) {
+    if (config !== undefined) {
       const secretStore = getSecretStore("DATABASE", { prismaClient: this._prisma });
-      await secretStore.setSecret(secretKey, config);
+      const parsedConfig = ClickhouseConnectionSchema.parse(config);
+      await secretStore.setSecret(secretKey, parsedConfig);
     }
As per coding guidelines: "Use zod for validation in packages/core and apps/webapp".

Also applies to: 130-135
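The `ClickhouseConnectionSchema` referenced in the suggested fix is not shown in this thread; its shape is an assumption. As a rough stand-in (written without zod here so the sketch stays self-contained), write-path validation might look like:

```typescript
type ClickhouseConnection = {
  url: string;
};

// Minimal stand-in for a zod schema along the lines of
// z.object({ url: z.string().url() }) — rejects malformed config
// at the persistence boundary instead of failing later during reload.
function parseClickhouseConnection(config: unknown): ClickhouseConnection {
  if (typeof config !== "object" || config === null) {
    throw new Error("Invalid data store config: expected an object");
  }
  const { url } = config as Record<string, unknown>;
  if (typeof url !== "string" || !/^https?:\/\//.test(url)) {
    throw new Error("Invalid data store config: `url` must be an http(s) URL");
  }
  return { url };
}
```

In the real codebase the guideline calls for zod, so the hand-rolled checks above would be replaced by `ClickhouseConnectionSchema.parse(config)` with the signature inferred via `z.infer`.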

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`
around lines 99 - 109, The addDataStore function currently accepts config: any
and writes unvalidated data; replace the any with a proper typed schema
validated by zod before persistence: import or define a DataStoreConfigSchema
(in packages/core if available) and validate the incoming config inside
addDataStore (and the corresponding updateDataStore code paths referenced around
lines 130-135), throw or return a validation error if schema.parse/parseAsync
fails, and persist the typed result (not the raw any) so only validated config
is stored; update the function signature to use the inferred TypeScript type
from the zod schema instead of any.

Comment on lines +154 to +161
async deleteDataStore({ key, kind }: { key: string; kind: DataStoreKind }) {
const secretKey = this.#secretKey(key, kind);
const secretStore = getSecretStore("DATABASE", { prismaClient: this._prisma });
await secretStore.deleteSecret(secretKey).catch(() => {
// Secret may not exist — proceed with deletion
});

await this._prisma.organizationDataStore.delete({ where: { key } });

⚠️ Potential issue | 🟠 Major

Delete row before secret to reduce partial-failure impact.

Current order removes the secret first. If row deletion fails afterward, the store row remains but now references a missing secret, breaking subsequent reload hydration.

Suggested fix
   async deleteDataStore({ key, kind }: { key: string; kind: DataStoreKind }) {
     const secretKey = this.#secretKey(key, kind);
     const secretStore = getSecretStore("DATABASE", { prismaClient: this._prisma });
-    await secretStore.deleteSecret(secretKey).catch(() => {
-      // Secret may not exist — proceed with deletion
-    });
-
-    await this._prisma.organizationDataStore.delete({ where: { key } });
+    await this._prisma.organizationDataStore.delete({ where: { key } });
+    await secretStore.deleteSecret(secretKey).catch(() => {
+      // Secret may not exist — proceed
+    });
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/services/dataStores/organizationDataStoresRegistry.server.ts`
around lines 154 - 161, Change the operation order in deleteDataStore so the
organizationDataStore row is removed before deleting its associated secret: call
this._prisma.organizationDataStore.delete({ where: { key } }) first, then
compute the secretKey via this.#secretKey(key, kind) and call
secretStore.deleteSecret(secretKey) (using getSecretStore("DATABASE", {
prismaClient: this._prisma }) as before), still catching errors from secret
deletion to tolerate missing secrets.

Comment thread apps/webapp/app/v3/eventRepository/index.server.ts Outdated
Comment on lines +81 to 84
// Support legacy Postgres store for self-hosters
if (taskEventStore === EVENT_STORE_TYPES.POSTGRES) {
return { repository: eventRepository, store: getTaskEventStore() };
}

⚠️ Potential issue | 🟠 Major

Return the actual store in the Postgres branch.

When taskEventStore === EVENT_STORE_TYPES.POSTGRES, returning store: getTaskEventStore() can advertise ClickHouse if the instance default has moved on. That leaves callers with a Postgres repository and a non-Postgres store token.

🐛 Suggested fix
   // Support legacy Postgres store for self-hosters
   if (taskEventStore === EVENT_STORE_TYPES.POSTGRES) {
-    return { repository: eventRepository, store: getTaskEventStore() };
+    return { repository: eventRepository, store: EVENT_STORE_TYPES.POSTGRES };
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/eventRepository/index.server.ts` around lines 81 - 84,
When taskEventStore === EVENT_STORE_TYPES.POSTGRES you must return the actual
Postgres store instead of calling the default getTaskEventStore() which can
return a rotated default (e.g. ClickHouse); change the branch that returns {
repository: eventRepository, store: getTaskEventStore() } to fetch and return
the Postgres-specific store (for example by calling
getTaskEventStore(EVENT_STORE_TYPES.POSTGRES) or the dedicated getter for
Postgres such as getPostgresTaskEventStore()) so the returned store token
matches the Postgres repository; update the branch guarding taskEventStore,
referencing taskEventStore, EVENT_STORE_TYPES.POSTGRES, eventRepository, and
getTaskEventStore() in your change.

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 3 new potential issues.

View 6 additional findings in Devin Review.


Comment on lines 60 to 71
  const { repository: resolvedRepository } =
    await clickhouseFactory.getEventRepositoryForOrganization(taskEventStore, organizationId);

  if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE_V2) {
-   return { repository: clickhouseEventRepositoryV2, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 };
+   return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 };
  }

  if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE) {
-   return { repository: clickhouseEventRepository, store: EVENT_STORE_TYPES.CLICKHOUSE };
+   return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE };
  }

  return { repository: eventRepository, store: EVENT_STORE_TYPES.POSTGRES };

🔴 getConfiguredEventRepository unconditionally calls factory with "postgres" store, causing a throw

In getConfiguredEventRepository, clickhouseFactory.getEventRepositoryForOrganization(taskEventStore, organizationId) is called unconditionally at line 60-61, before checking the store type. When taskEventStore resolves to "postgres" (the default for self-hosters), buildEventRepository("postgres", ...) is called inside the factory (clickhouseFactory.server.ts:385), which throws "Unknown ClickHouse event repository store: postgres". The Postgres fallback at line 71 is never reached. This breaks the Logs feature for self-hosters using Postgres event stores (called from LogsListPresenter.server.ts:209 and LogDetailPresenter.server.ts:30).

Suggested change
- const { repository: resolvedRepository } =
-   await clickhouseFactory.getEventRepositoryForOrganization(taskEventStore, organizationId);
-
- if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE_V2) {
-   return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 };
- }
-
- if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE) {
-   return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE };
- }
-
- return { repository: eventRepository, store: EVENT_STORE_TYPES.POSTGRES };
+ if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE_V2) {
+   const { repository: resolvedRepository } =
+     await clickhouseFactory.getEventRepositoryForOrganization(taskEventStore, organizationId);
+   return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 };
+ }
+
+ if (taskEventStore === EVENT_STORE_TYPES.CLICKHOUSE) {
+   const { repository: resolvedRepository } =
+     await clickhouseFactory.getEventRepositoryForOrganization(taskEventStore, organizationId);
+   return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE };
+ }
+
+ return { repository: eventRepository, store: EVENT_STORE_TYPES.POSTGRES };

Fixed in de2996d. The factory call is now made only inside the ClickHouse branches, so the Postgres fallback path at the bottom is reached without ever calling buildEventRepository("postgres").

Comment on lines +587 to +589
if (!run.organizationId || !run.environmentType) {
continue;
}

🚩 RunsReplicationService per-org routing skips runs with null organizationId

In the refactored #flushBatch (runsReplicationService.server.ts:587-589), runs where run.organizationId or run.environmentType is null/undefined are silently skipped (continue). This means those runs are never inserted into ClickHouse. The old code inserted all runs into a single ClickHouse instance regardless of missing fields. While organizationId is required in the Prisma schema (non-nullable), the replication service works with raw WAL data where fields might theoretically be missing during transitions. This behavioral change is worth noting — it's arguably correct (don't replicate malformed data) but could cause silent data loss for edge-case rows.


Comment on lines +126 to +130
const queryClickhouse = await clickhouseFactory.getClickhouseForOrganization(
project.organizationId,
"query"
);
const presenter = new ErrorsListPresenter($replica, queryClickhouse);

🚩 ErrorsListPresenter route changed from "logs" to "query" client type

In the errors list route (errors._index/route.tsx:126-130), the old code used logsClickhouseClient but the new code requests clientType "query" from the factory. This changes which ClickHouse instance and settings are used for the errors list queries. The ErrorsListPresenter queries errors.occurrencesListQueryBuilder(), errors.createOccurrencesQueryBuilder(), and errors.listQueryBuilder(). Whether this is intentional or accidental depends on whether the errors table is co-located with the logs or query ClickHouse instances. This is a semantic behavior change from the old code that used the logs client specifically.



Reverted to "logs" in de2996d to preserve the pre-refactor behavior of using logsClickhouseClient for ErrorsListPresenter.

…resenter clients

Co-Authored-By: Matt Aitken <matt@mattaitken.com>
@coderabbitai coderabbitai bot left a comment

♻️ Duplicate comments (2)
apps/webapp/app/v3/eventRepository/index.server.ts (2)

218-222: ⚠️ Potential issue | 🔴 Critical

recordRunEvent can route Postgres runs into the ClickHouse factory path.

For legacy runs where foundRun.taskEventStore is Postgres, Line 218-222 still resolves via clickhouseFactory, which can break event writes on this hot path.

🐛 Proposed fix
-    const { repository: $eventRepository } =
-      await clickhouseFactory.getEventRepositoryForOrganization(
-        foundRun.taskEventStore,
-        foundRun.runtimeEnvironment.organizationId
-      );
+    const { repository: $eventRepository } = await getV3EventRepository(
+      foundRun.runtimeEnvironment.organizationId,
+      foundRun.taskEventStore
+    );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/eventRepository/index.server.ts` around lines 218 - 222,
recordRunEvent is incorrectly routing runs whose foundRun.taskEventStore is
Postgres into clickhouseFactory.getEventRepositoryForOrganization; change the
routing to branch on foundRun.taskEventStore and call the appropriate factory
when resolving the repository (e.g., use
postgresFactory.getEventRepositoryForOrganization or the generic repository
factory for Postgres stores instead of clickhouseFactory). Locate the call to
clickhouseFactory.getEventRepositoryForOrganization and add a guard that checks
foundRun.taskEventStore (or the TaskEventStore enum) and delegates to the
correct factory/method so Postgres runs use the Postgres event repository path.

80-99: ⚠️ Potential issue | 🟠 Major

Postgres fallback returns inconsistent store token and still pre-calls ClickHouse factory.

Line 84 and Line 98 return a Postgres repository but may report a non-Postgres store via getTaskEventStore(). Also, Line 87 calls the ClickHouse factory before fully validating store type, which weakens the default fallback intent for unexpected persisted values.

🐛 Proposed fix
 export async function getEventRepository(
   organizationId: string,
   featureFlags: Record<string, unknown> | undefined,
   parentStore: string | undefined
 ): Promise<{ repository: IEventRepository; store: string }> {
   const taskEventStore = parentStore ?? (await resolveTaskEventRepositoryFlag(featureFlags));
 
-  // Support legacy Postgres store for self-hosters
-  if (taskEventStore === EVENT_STORE_TYPES.POSTGRES) {
-    return { repository: eventRepository, store: getTaskEventStore() };
-  }
-
-  const { repository: resolvedRepository } =
-    await clickhouseFactory.getEventRepositoryForOrganization(taskEventStore, organizationId);
-
   switch (taskEventStore) {
+    case EVENT_STORE_TYPES.POSTGRES: {
+      return { repository: eventRepository, store: EVENT_STORE_TYPES.POSTGRES };
+    }
     case EVENT_STORE_TYPES.CLICKHOUSE_V2: {
+      const { repository: resolvedRepository } =
+        await clickhouseFactory.getEventRepositoryForOrganization(taskEventStore, organizationId);
       return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE_V2 };
     }
     case EVENT_STORE_TYPES.CLICKHOUSE: {
+      const { repository: resolvedRepository } =
+        await clickhouseFactory.getEventRepositoryForOrganization(taskEventStore, organizationId);
       return { repository: resolvedRepository, store: EVENT_STORE_TYPES.CLICKHOUSE };
     }
     default: {
-      return { repository: eventRepository, store: getTaskEventStore() };
+      return { repository: eventRepository, store: EVENT_STORE_TYPES.POSTGRES };
     }
   }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/webapp/app/v3/eventRepository/index.server.ts` around lines 80 - 99, The
code prematurely calls clickhouseFactory.getEventRepositoryForOrganization and
can return a mismatched store token; fix by first determining taskEventStore
(using parentStore ?? resolveTaskEventRepositoryFlag(...)) and if taskEventStore
=== EVENT_STORE_TYPES.POSTGRES immediately return { repository: eventRepository,
store: EVENT_STORE_TYPES.POSTGRES } without calling
clickhouseFactory.getEventRepositoryForOrganization; only call
clickhouseFactory.getEventRepositoryForOrganization when taskEventStore is not
POSTGRES, and in the default fallback return use getTaskEventStore() or
EVENT_STORE_TYPES.POSTGRES consistently so returned store token matches the
repository (refer to taskEventStore, EVENT_STORE_TYPES.POSTGRES,
clickhouseFactory.getEventRepositoryForOrganization, getTaskEventStore,
eventRepository, organizationId).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: fcb3eb53-e928-4269-b5f1-e7983bb980be

📥 Commits

Reviewing files that changed from the base of the PR and between c5f7ca3 and de2996d.

📒 Files selected for processing (3)
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx
  • apps/webapp/app/v3/eventRepository/index.server.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx
  • apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors.$fingerprint/route.tsx
📜 Review details
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Import from packages/core using subpaths only, never from the root. Example: @trigger.dev/core/subpath instead of @trigger.dev/core.

Add crumbs as you write code using `// @crumbs` comments or `// #region @crumbs` blocks. They stay on the branch throughout development and must be stripped by `agentcrumbs strip` before merge.

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: Access environment variables through the env export of env.server.ts instead of directly accessing process.env
Use subpath exports from @trigger.dev/core package instead of importing from the root @trigger.dev/core path

Use named constants for sentinel/placeholder values (e.g. const UNSET_VALUE = '__unset__') instead of raw string literals scattered across comparisons

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
apps/webapp/**/*.server.ts

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

apps/webapp/**/*.server.ts: Never use request.signal for detecting client disconnects. Use getRequestAbortSignal() from app/services/httpAsyncStorage.server.ts instead, which is wired directly to Express res.on('close') and fires reliably
Access environment variables via env export from app/env.server.ts. Never use process.env directly
Always use findFirst instead of findUnique in Prisma queries. findUnique has an implicit DataLoader that batches concurrent calls and has active bugs even in Prisma 6.x (uppercase UUIDs returning null, composite key SQL correctness issues, 5-10x worse performance). findFirst is never batched and avoids this entire class of issues

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
**/*.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Prefer static imports over dynamic imports. Only use dynamic import() when circular dependencies cannot be resolved otherwise, code splitting is genuinely needed for performance, or the module must be loaded conditionally at runtime.

Files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
🧠 Learnings (16)
📓 Common learnings (Learnt from: CR, CLAUDE.md, 2026-04-16T16:43:43.344Z)
- For apps and internal packages in `apps/*` and `internal-packages/*`, use `typecheck` for verification. Never use `build` - building proves almost nothing about correctness.
- For public packages in `packages/*`, use `build` for verification.
- Add a changeset when modifying any public package (`packages/*` or `integrations/*`). Default to patch for bug fixes and minor changes. Confirm with maintainers before selecting minor (new features). Never select major without explicit approval.
- When modifying only server components (`apps/webapp/`, `apps/supervisor/`, etc.) with no package changes, add a `.server-changes/` file instead of a changeset.
- Zod must be pinned to a single version across the entire monorepo (currently `3.25.76`). When adding zod to a new or existing package, use the exact same version as the rest of the repo - never a different version or a range.
- Only modify V2 code paths when encountering V1/V2 branching in services. Legacy V1-only code includes MarQS queue, triggerTaskV1, cancelTaskRunV1. All new work uses Run Engine 2.0 (`internal/run-engine`) and redis-worker.
📚 Learning: 2025-07-12T18:06:04.133Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 2264
File: apps/webapp/app/services/runsRepository.server.ts:172-174
Timestamp: 2025-07-12T18:06:04.133Z
Learning: In apps/webapp/app/services/runsRepository.server.ts, the in-memory status filtering after fetching runs from Prisma is intentionally used as a workaround for ClickHouse data delays. This approach is acceptable because the result set is limited to a maximum of 100 runs due to pagination, making the performance impact negligible.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-03-26T10:02:25.354Z
Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3254
File: apps/webapp/app/services/platformNotifications.server.ts:363-385
Timestamp: 2026-03-26T10:02:25.354Z
Learning: In `triggerdotdev/trigger.dev`, the `getNextCliNotification` fallback in `apps/webapp/app/services/platformNotifications.server.ts` intentionally uses `prisma.orgMember.findFirst` (single org) when no `projectRef` is provided. This is acceptable for v1 because the CLI (`dev` and `login` commands) always passes `projectRef` in normal usage, making the fallback a rare edge case. Do not flag the single-org fallback as a multi-org correctness bug in this file.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-13T21:44:00.032Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/services/taskIdentifierRegistry.server.ts:24-67
Timestamp: 2026-04-13T21:44:00.032Z
Learning: In `apps/webapp/app/services/taskIdentifierRegistry.server.ts`, the sequential upsert/updateMany/findMany writes in `syncTaskIdentifiers` are intentionally NOT wrapped in a Prisma transaction. This function runs only during deployment-change events (low-concurrency path), and any partial `isInLatestDeployment` state is acceptable because it self-corrects on the next deployment. Do not flag this as a missing-transaction/atomicity issue in future reviews.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-16T14:21:14.907Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: internal-packages/database/prisma/schema.prisma:666-666
Timestamp: 2026-04-16T14:21:14.907Z
Learning: In `triggerdotdev/trigger.dev`, the `BackgroundWorkerTask` covering index on `(runtimeEnvironmentId, slug, triggerSource)` lives in `internal-packages/database/prisma/migrations/20260413000000_add_bwt_covering_index/migration.sql` as a `CREATE INDEX CONCURRENTLY IF NOT EXISTS`, intentionally in its own migration file separate from the `TaskIdentifier` table migration. Do not flag this index as missing from the schema migrations in future reviews.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-16T13:45:18.782Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/test/engine/taskIdentifierRegistry.test.ts:3-19
Timestamp: 2026-04-16T13:45:18.782Z
Learning: In `apps/webapp/test/engine/taskIdentifierRegistry.test.ts`, the `vi.mock` calls for `~/services/taskIdentifierCache.server` (stubbing `getTaskIdentifiersFromCache` and `populateTaskIdentifierCache`), `~/models/task.server` (stubbing `getAllTaskIdentifiers`), and `~/db.server` (stubbing `prisma` and `$replica`) are intentional. The suite uses real Postgres via testcontainers for all `TaskIdentifier` DB operations, but isolates the Redis cache layer and legacy query fallback as separate concerns not exercised in this test file. Do not flag these mocks as violations of the no-mocks policy in future reviews.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-16T14:21:15.229Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/components/logs/LogsTaskFilter.tsx:135-163
Timestamp: 2026-04-16T14:21:15.229Z
Learning: In `triggerdotdev/trigger.dev` PR `#3368`, the `TaskIdentifier` table has a `@unique([runtimeEnvironmentId, slug])` DB constraint, guaranteeing one row per (environment, slug). In components like `apps/webapp/app/components/logs/LogsTaskFilter.tsx` and `apps/webapp/app/components/runs/v3/RunFilters.tsx`, using `key={item.slug}` for SelectItem list items is correct and unique. Do NOT flag `key={item.slug}` as potentially non-unique — the old duplicate-(slug, triggerSource) issue only existed with the legacy `DISTINCT` query, which this registry replaces.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-13T21:44:00.644Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/services/taskIdentifierRegistry.server.ts:49-58
Timestamp: 2026-04-13T21:44:00.644Z
Learning: In `triggerdotdev/trigger.dev`, a deployment with zero tasks is not a realistic scenario in practice. Do not flag missing handling for empty-task deployments in `apps/webapp/app/services/taskIdentifierRegistry.server.ts` or related registry/sync logic.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-03-22T19:34:22.737Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/routes/_app.orgs.$organizationSlug.projects.$projectParam.env.$envParam.errors._index/route.tsx:99-103
Timestamp: 2026-03-22T19:34:22.737Z
Learning: In `apps/webapp/app/presenters/v3/ErrorsListPresenter.server.ts`, the `statuses` filter (UNRESOLVED | RESOLVED | IGNORED) is applied in-memory after the ClickHouse query and a batch Postgres lookup via `getErrorGroupStates`. This is intentional: `status` lives in the Postgres `ErrorGroupState` table, not in ClickHouse, so post-query filtering is the correct approach. Do not flag this as a missing predicate or a no-op filter during code review.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-15T15:39:06.868Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-15T15:39:06.868Z
Learning: For V1/V2 branching in services within `apps/webapp/app/v3/`, only modify V2 code paths. All new work uses Run Engine 2.0 (`internal/run-engine`) and redis-worker, not legacy V1 code (MarQS queue, triggerTaskV1, cancelTaskRunV1, etc.).

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-16T14:19:16.309Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-04-16T14:19:16.309Z
Learning: Applies to apps/webapp/{app/v3/services/triggerTask.server.ts,app/v3/services/batchTriggerV3.server.ts} : In `triggerTask.server.ts` and `batchTriggerV3.server.ts`, do NOT add database queries. Task defaults (TTL, etc.) are resolved via `backgroundWorkerTask.findFirst()` in the queue concern (`queues.server.ts`). Piggyback on the existing query instead of adding new ones

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-16T14:19:16.309Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-04-16T14:19:16.309Z
Learning: Applies to apps/webapp/app/v3/services/{cancelTaskRun,batchTriggerV3}.server.ts : When editing services that branch on `RunEngineVersion` to support both V1 and V2 (e.g., `cancelTaskRun.server.ts`, `batchTriggerV3.server.ts`), only modify V2 code paths

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-04-07T14:12:59.018Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3331
File: apps/webapp/app/runEngine/concerns/batchPayloads.server.ts:112-136
Timestamp: 2026-04-07T14:12:59.018Z
Learning: In `apps/webapp/app/runEngine/concerns/batchPayloads.server.ts`, the `pRetry` call wrapping `uploadPacketToObjectStore` intentionally retries **all** error types (no `shouldRetry` filter / `AbortError` guards). The maintainer explicitly prefers over-retrying to under-retrying because multiple heterogeneous object store backends are supported and it is impractical to enumerate all permanent error signatures. Do not flag this as an issue in future reviews.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
📚 Learning: 2026-03-29T19:16:28.864Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3291
File: apps/webapp/app/v3/featureFlags.ts:53-65
Timestamp: 2026-03-29T19:16:28.864Z
Learning: When reviewing TypeScript code that uses Zod v3, treat `z.coerce.*()` schemas as their direct Zod type (e.g., `z.coerce.boolean()` returns a `ZodBoolean` with `_def.typeName === "ZodBoolean"`) rather than a `ZodEffects`. Only `.preprocess()`, `.refine()`/`.superRefine()`, and `.transform()` are expected to wrap schemas in `ZodEffects`. Therefore, in reviewers’ logic like `getFlagControlType`, do not flag/unblock failures that require unwrapping `ZodEffects` when the input schema is a `z.coerce.*` schema.

Applied to files:

  • apps/webapp/app/v3/eventRepository/index.server.ts
🔇 Additional comments (1)
apps/webapp/app/v3/eventRepository/index.server.ts (1)

24-33: Good extraction for org-scoped repository resolution.

resolveEventRepositoryForStore() cleanly centralizes org-aware ClickHouse lookup with Prisma fallback.

