feat: Sessions - bidirectional durable agent streams by ericallam · Pull Request #3417 · triggerdotdev/trigger.dev

ericallam · 2026-04-20T14:18:13Z

What this enables

A new first-class primitive, Session, for durable bidirectional I/O that outlives a single run. Sessions give you a server-managed channel pair (.out from the task, .in from the client) that you can write to, read from, and subscribe to across many runs. You can also filter, list, and close sessions, all through a single identifier.

Use cases unblocked

Chat agents that persist across turns. Turns 1..N attach to the same Session. The UI subscribes once and keeps receiving output as new runs attach.
Approval loops and long-running tasks with user feedback. The task waits on .in, the client writes to .in, and the server enforces no-writes-after-close.
Workflow progress streams that live past the run. A dashboard can subscribe to .out after the task finishes to replay the history.
Any session-scoped state where pre-existing run streams (scoped to a single run) were too narrow.

Public API surface

Control plane

POST /api/v1/sessions to create. Idempotent when you supply externalId.
GET /api/v1/sessions/:session to retrieve by friendlyId (session_abc) or by your own externalId. The server disambiguates via the session_ prefix.
GET /api/v1/sessions to list with filters (type, tag, taskIdentifier, externalId, derived status = ACTIVE/CLOSED/EXPIRED, created-at period/from/to) and cursor pagination. Backed by ClickHouse.
PATCH /api/v1/sessions/:session to update tags/metadata/externalId.
POST /api/v1/sessions/:session/close to terminate. Idempotent, hard-blocks new server-brokered writes.
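
The retrieve route accepts either identifier form. A minimal sketch of the prefix-based disambiguation described above, assuming the check is a plain `startsWith("session_")`. The `SessionLookup` type and the `resolveSessionParam` name are hypothetical and not taken from the PR:

```typescript
// Hypothetical shape for the lookup key; not the PR's actual type.
type SessionLookup =
  | { kind: "friendlyId"; friendlyId: string }
  | { kind: "externalId"; externalId: string };

const SESSION_PREFIX = "session_";

// Values carrying the session_ prefix are treated as friendlyIds;
// everything else is treated as a caller-supplied externalId.
function resolveSessionParam(param: string): SessionLookup {
  if (param.startsWith(SESSION_PREFIX)) {
    return { kind: "friendlyId", friendlyId: param };
  }
  return { kind: "externalId", externalId: param };
}
```

Under this assumption, an externalId that itself begins with session_ would be read as a friendlyId, so callers may prefer to avoid that prefix in their own identifiers.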

Realtime

PUT /realtime/v1/sessions/:session/:io to initialize a channel. Returns S2 credentials in headers so clients can write directly to S2 for high-throughput cases.
GET /realtime/v1/sessions/:session/:io for SSE subscribe.
POST /realtime/v1/sessions/:session/:io/append for server-side appends.

Scopes

sessions is now a ResourceType. read:sessions:{id}, write:sessions:{id}, admin:sessions:{id} all flow through the existing JWT validator.
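
As an illustration of the scope format above, a small sketch that builds and checks scope strings. The helper names are hypothetical, and the exact-match check is an assumption: the PR does not describe how the existing JWT validator treats admin scopes or wildcards.

```typescript
type SessionAction = "read" | "write" | "admin";

// Builds a scope string in the {action}:sessions:{id} form listed above.
function sessionScope(action: SessionAction, sessionId: string): string {
  return `${action}:sessions:${sessionId}`;
}

// Illustrative exact-match check only; the real validator may apply
// broader rules (for example, admin implying read) that are not shown here.
function hasSessionScope(
  granted: string[],
  action: SessionAction,
  sessionId: string
): boolean {
  return granted.includes(sessionScope(action, sessionId));
}
```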

Implementation summary

Postgres (`Session` table)

Scalar scoping columns (projectId, runtimeEnvironmentId, environmentType, organizationId) with no foreign keys. This matches the January TaskRun FK-removal decision and keeps the write path partition-friendly.
Point-lookup indexes only: friendlyId unique, (env, externalId) unique, expiresAt. List queries are served from ClickHouse, so Postgres stays insert-heavy.
Terminal markers (closedAt, closedReason, expiresAt) are write-once. No status enum, no counters, no currentRunId pointer. All run-related state is derived.
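
Because there is no status column, status has to be computed from the terminal markers. A minimal sketch of such a derivation, assuming CLOSED takes precedence over EXPIRED and that a session counts as expired once `expiresAt` is at or before the current time. Neither rule is spelled out in the PR, and the type and function names are hypothetical:

```typescript
type SessionStatus = "ACTIVE" | "CLOSED" | "EXPIRED";

// Only the write-once terminal markers that status depends on.
type SessionMarkers = {
  closedAt: Date | null;
  expiresAt: Date | null;
};

// Assumption: an explicit close wins over expiry when both apply.
function deriveSessionStatus(
  session: SessionMarkers,
  now: Date = new Date()
): SessionStatus {
  if (session.closedAt !== null) return "CLOSED";
  if (session.expiresAt !== null && session.expiresAt.getTime() <= now.getTime()) {
    return "EXPIRED";
  }
  return "ACTIVE";
}
```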

ClickHouse (`sessions_v1`)

ReplacingMergeTree partitioned by month, ordered by (org_id, project_id, environment_id, created_at, session_id). tags indexed with a tokenbf_v1 skip index.
SessionsReplicationService mirrors RunsReplicationService exactly: logical replication with leader-locked consumer, ConcurrentFlushScheduler, retry with exponential backoff + jitter, identical metric shape. Dedicated slot + publication so the two consume independently.
SessionsRepository + ClickHouseSessionsRepository expose list / count / tags with the same cursor pagination convention as runs and waitpoints.
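
For readers unfamiliar with the retry strategy named above, here is a generic sketch of exponential backoff with "full jitter". It is not the replication service's code; the option names and the choice of full jitter are assumptions made for illustration.

```typescript
type BackoffOptions = {
  baseDelayMs: number; // delay ceiling for attempt 0
  maxDelayMs: number; // hard cap on the ceiling
};

// Full jitter: the ceiling doubles per attempt up to maxDelayMs, and the
// actual delay is drawn uniformly from [0, ceiling). The random source is
// injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  opts: BackoffOptions,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

Randomizing the delay spreads retries out so that many consumers failing at the same moment do not all retry in lockstep.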

S2

New key format for session channels: sessions/{friendlyId}/{out|in}. The existing runs/{runId}/{streamId} format for implicit run streams is completely untouched.
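
The two key formats can be sketched side by side. The function names are hypothetical; only the key shapes come from the description above.

```typescript
type SessionIo = "out" | "in";

// Session channel key: sessions/{friendlyId}/{out|in}
function sessionStreamKey(friendlyId: string, io: SessionIo): string {
  return `sessions/${friendlyId}/${io}`;
}

// Existing run-scoped key, unchanged by this PR: runs/{runId}/{streamId}
function runStreamKey(runId: string, streamId: string): string {
  return `runs/${runId}/${streamId}`;
}
```

Because the two formats use different top-level prefixes, session channels and implicit run streams cannot collide in the key space.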

What did not change

Run-scoped streams.pipe / streams.input still behave exactly as before. They do not create Session rows and the existing routes are unchanged. Sessions are a net-new primitive for the next phase of agent features, not a reshaping of the current streams API.

Verification

Webapp typecheck clean (10/10).
apps/webapp/test/sessionsReplicationService.test.ts exercises insert and update round-trips through Postgres logical replication into ClickHouse via testcontainers.
Live end-to-end against local dev: create, retrieve (friendlyId + externalId), update, .out.initialize, .out.append x2, .in.send, .out.subscribe over SSE, list (type, tag, status, externalId, pagination), close, idempotent re-close. Replicated row lands in ClickHouse within ~1s with closed_reason intact.

Not in this PR

Client SDK (lives on the ai-chat feature branch, wires up the runtime ergonomics for chat.agent).
Dashboard routes.
chat.agent integration.

Test plan

pnpm run typecheck --filter webapp
pnpm run test --filter webapp ./test/sessionsReplicationService.test.ts --run
Start the webapp with SESSION_REPLICATION_CLICKHOUSE_URL and SESSION_REPLICATION_ENABLED=1 set. Confirm the slot and publication auto-create on boot.
Hit POST /api/v1/sessions and verify the row replicates to trigger_dev.sessions_v1 within a couple of seconds.
POST /api/v1/sessions/:id/close and confirm subsequent POST /realtime/v1/sessions/:id/out/append returns 400.
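
The last step of the plan could be scripted roughly as follows. This is a sketch under several assumptions: Bearer-token authentication, a JSON append body of arbitrary shape, and an injected `FetchLike` function (a hypothetical type) so the check can run against a stub. Only the two route paths and the expected 400 come from the plan above.

```typescript
// Minimal fetch-like signature so the check can be exercised without a network.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body?: string }
) => Promise<{ status: number }>;

// Closes the session, then attempts a server-side append to .out and
// reports whether the append was rejected with 400.
async function checkAppendRejectedAfterClose(
  baseUrl: string,
  sessionId: string,
  apiKey: string,
  fetchFn: FetchLike
): Promise<boolean> {
  const headers = {
    Authorization: `Bearer ${apiKey}`, // assumed auth scheme
    "Content-Type": "application/json",
  };

  await fetchFn(`${baseUrl}/api/v1/sessions/${sessionId}/close`, {
    method: "POST",
    headers,
  });

  const appendResponse = await fetchFn(
    `${baseUrl}/realtime/v1/sessions/${sessionId}/out/append`,
    // Assumed payload; the real append body format is not shown in this PR.
    { method: "POST", headers, body: JSON.stringify({ hello: "world" }) }
  );

  return appendResponse.status === 400;
}
```

To run it against a live webapp, pass a thin wrapper around the global `fetch` as `fetchFn`.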

Durable, typed, bidirectional I/O primitive that outlives a single run. Ship target is agent/chat use cases; run-scoped streams.pipe/streams.input are untouched and do not create Session rows.

Postgres
- New Session table: id, friendlyId, externalId, type (plain string), denormalised project/environment/organization scalar columns (no FKs), taskIdentifier, tags String[], metadata Json, closedAt, closedReason, expiresAt, timestamps
- Point-lookup indexes only (friendlyId unique, (env, externalId) unique, expiresAt). List queries are served from ClickHouse so Postgres stays minimal and insert-heavy.

Control-plane API
- POST /api/v1/sessions create (idempotent via externalId)
- GET /api/v1/sessions list with filters (type, tag, taskIdentifier, externalId, status ACTIVE|CLOSED|EXPIRED, period/from/to) and cursor pagination, ClickHouse-backed
- GET /api/v1/sessions/:session retrieve, polymorphic: `session_` prefix hits friendlyId, otherwise externalId
- PATCH /api/v1/sessions/:session update tags/metadata/externalId
- POST /api/v1/sessions/:session/close terminal close (idempotent)

Realtime (S2-backed)
- PUT /realtime/v1/sessions/:session/:io returns S2 creds
- GET /realtime/v1/sessions/:session/:io SSE subscribe
- POST /realtime/v1/sessions/:session/:io/append server-side append
- S2 key format: sessions/{friendlyId}/{out|in}

Auth
- sessions added to ResourceTypes. read:sessions:{id}, write:sessions:{id}, admin:sessions:{id} scopes work via existing JWT validation.

ClickHouse
- sessions_v1 ReplacingMergeTree table
- SessionsReplicationService mirrors RunsReplicationService exactly: logical replication with leader-locked consumer, ConcurrentFlushScheduler, retry with exponential backoff + jitter, identical metric shape. Dedicated slot + publication (sessions_to_clickhouse_v1[_publication]).
- SessionsRepository + ClickHouseSessionsRepository expose list, count, tags with cursor pagination keyed by (created_at DESC, session_id DESC).
- Derived status (ACTIVE/CLOSED/EXPIRED) computed from closed_at + expires_at; in-memory fallback on list results to catch pre-replication writes.

Verification
- Webapp typecheck 10/10
- Core + SDK build 3/3
- sessionsReplicationService.test.ts integration tests 2/2 (insert + update round-trip via testcontainers)
- Live round-trip against local dev: create -> retrieve (friendlyId and externalId) -> out.initialize -> out.append x2 -> in.send -> out.subscribe (receives records) -> close -> ClickHouse sessions_v1 shows the replicated row with closed_reason
- Live list smoke: tag, type, status CLOSED, externalId, and cursor pagination

changeset-bot · 2026-04-20T14:18:22Z

🦋 Changeset detected

Latest commit: 2210fe2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 29 packages

Name	Type
@trigger.dev/core	Patch
@trigger.dev/build	Patch
trigger.dev	Patch
@trigger.dev/python	Patch
@trigger.dev/redis-worker	Patch
@trigger.dev/schema-to-json	Patch
@trigger.dev/sdk	Patch
@internal/cache	Patch
@internal/clickhouse	Patch
@internal/llm-model-catalog	Patch
@internal/redis	Patch
@internal/replication	Patch
@internal/run-engine	Patch
@internal/schedule-engine	Patch
@internal/testcontainers	Patch
@internal/tracing	Patch
@internal/tsql	Patch
@internal/zod-worker	Patch
d3-chat	Patch
references-d3-openai-agents	Patch
references-nextjs-realtime	Patch
references-realtime-hooks-test	Patch
references-realtime-streams	Patch
references-telemetry	Patch
@internal/sdk-compat-tests	Patch
@trigger.dev/react-hooks	Patch
@trigger.dev/rsc	Patch
@trigger.dev/database	Patch
@trigger.dev/otlp-importer	Patch

coderabbitai · 2026-04-20T14:18:33Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 47bb187a-c4ac-4051-bb95-66de4f297537

📥 Commits

Reviewing files that changed from the base of the PR and between ff46f33 and 2210fe2.

📒 Files selected for processing (1)

apps/webapp/app/routes/api.v1.sessions.ts

📜 Recent review details

🧰 Additional context used

📓 Path-based instructions (7)

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

apps/webapp/app/routes/api.v1.sessions.ts

{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

apps/webapp/app/routes/api.v1.sessions.ts

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Add crumbs as you write code using `// @crumbs` comments or `// #region @crumbs` blocks. These are temporary debug instrumentation and must be stripped using `agentcrumbs strip` before merge.

Files:

apps/webapp/app/routes/api.v1.sessions.ts

**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

apps/webapp/app/routes/api.v1.sessions.ts

**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

apps/webapp/app/routes/api.v1.sessions.ts

**/*.ts{,x}

📄 CodeRabbit inference engine (CLAUDE.md)

Always import from @trigger.dev/sdk when writing Trigger.dev tasks. Never use @trigger.dev/sdk/v3 or deprecated client.defineJob.

Files:

apps/webapp/app/routes/api.v1.sessions.ts

apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: Access environment variables through the env export of env.server.ts instead of directly accessing process.env
Use subpath exports from @trigger.dev/core package instead of importing from the root @trigger.dev/core path

Use named constants for sentinel/placeholder values (e.g. const UNSET_VALUE = '__unset__') instead of raw string literals scattered across comparisons

Files:

apps/webapp/app/routes/api.v1.sessions.ts

🧠 Learnings (23)

📓 Common learnings

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3417
File: apps/webapp/app/services/sessionsReplicationService.server.ts:224-231
Timestamp: 2026-04-20T14:50:16.440Z
Learning: In `apps/webapp/app/services/sessionsReplicationService.server.ts`, the acknowledge-before-flush pattern is intentional and mirrors `runsReplicationService.server.ts`. `_latestCommitEndLsn` is updated at Postgres commit time and acknowledged on a periodic interval via `#acknowledgeLatestTransaction`, without waiting for ClickHouse batch flush to complete. Do not flag this as a durability/ordering issue — this at-least-once delivery trade-off is an established project-wide convention for both runs and sessions replication services.

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3417
File: apps/webapp/app/routes/realtime.v1.sessions.$session.$io.ts:37-51
Timestamp: 2026-04-20T15:06:16.910Z
Learning: In `apps/webapp/app/routes/realtime.v1.sessions.$session.$io.ts` (and all session realtime read paths), `$replica` is intentionally used for the `resolveSessionByIdOrExternalId` call — including the `closedAt` guard in the PUT/initialize path. The project convention is to use `$replica` consistently across all session realtime routes. The race window (replica lag allowing a ghost-initialize after close) is accepted as not realistic in practice (clients follow the close API response; they do not race it). If replica lag ever causes issues, the mitigation is to revisit all realtime routes together, not to swap individual routes to `prisma`. Do not flag `$replica` usage in session realtime routes as a stale-read issue.

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3417
File: apps/webapp/app/services/sessionsReplicationService.server.ts:204-215
Timestamp: 2026-04-20T15:08:49.959Z
Learning: In `apps/webapp/app/services/sessionsReplicationService.server.ts` and `apps/webapp/app/services/runsReplicationService.server.ts`, the `getKey` function in `ConcurrentFlushScheduler` uses `${item.event}_${item.session.id}` / `${item.event}_${item.run.id}` respectively. This pattern is intentionally kept identical across both replication services for consistency. Any change to the deduplication key shape (e.g., keying solely by session/run id) must be applied to both services together, never to one service in isolation. Tracking as a cross-service follow-up.

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3417
File: apps/webapp/app/services/sessionsRepository/clickhouseSessionsRepository.server.ts:27-40
Timestamp: 2026-04-20T15:08:57.551Z
Learning: In `apps/webapp/app/services/sessionsRepository/clickhouseSessionsRepository.server.ts`, the cursor predicate in `listSessionIds` compares only `session_id` while the `ORDER BY` clause uses `(created_at, session_id)`. This is intentional and consistent with the same pattern in `ClickHouseRunsRepository` and the waitpoints repository. Do not flag this as a skip/duplicate pagination bug in isolation — any fix must land across all three repositories at once as a shared follow-up.

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3417
File: internal-packages/clickhouse/src/sessions.ts:174-180
Timestamp: 2026-04-20T15:09:08.656Z
Learning: In `internal-packages/clickhouse/src/sessions.ts`, `getSessionTagsQueryBuilder` intentionally queries `trigger_dev.sessions_v1` WITHOUT `FINAL`, mirroring `getTaskRunTagsQueryBuilder` which queries `task_runs_v2` without `FINAL`. The DISTINCT arrayJoin tag-listing read can tolerate an occasional stale tag from a superseded ReplacingMergeTree row; the FINAL cost on a large table is considered not worth it. If FINAL is ever added, both tag query builders (sessions and runs) will be updated together. Do not flag the missing FINAL in either tag query builder as a consistency or stale-data issue.

📚 Learning: 2026-04-20T15:05:57.327Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3417
File: apps/webapp/app/routes/realtime.v1.sessions.$session.$io.append.ts:20-31
Timestamp: 2026-04-20T15:05:57.327Z
Learning: In `apps/webapp/app/routes/realtime.v1.sessions.$session.$io.append.ts`, the `MAX_APPEND_BODY_BYTES` cap is intentionally set to `1024 * 512` (512 KiB). The maintainer explicitly decided against lowering it to 128 KiB: the all-quotes worst-case JSON-escaping expansion that could exceed S2's 1 MiB per-record limit is considered pathological and not representative of real-world payloads (chat tokens, tool-call JSON, structured data). If overflow becomes a problem in practice, the preferred mitigation is an encoded-size guard inside `appendPart` itself. Do not flag this cap as a potential S2 overflow issue in future reviews.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-04-16T14:19:16.309Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-04-16T14:19:16.309Z
Learning: Applies to apps/webapp/**/*.server.ts : Always use `findFirst` instead of `findUnique` in Prisma queries. `findUnique` has an implicit DataLoader that batches concurrent calls and has active bugs even in Prisma 6.x (uppercase UUIDs returning null, composite key SQL correctness issues, 5-10x worse performance). `findFirst` is never batched and avoids this entire class of issues

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-04-13T21:44:00.032Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/services/taskIdentifierRegistry.server.ts:24-67
Timestamp: 2026-04-13T21:44:00.032Z
Learning: In `apps/webapp/app/services/taskIdentifierRegistry.server.ts`, the sequential upsert/updateMany/findMany writes in `syncTaskIdentifiers` are intentionally NOT wrapped in a Prisma transaction. This function runs only during deployment-change events (low-concurrency path), and any partial `isInLatestDeployment` state is acceptable because it self-corrects on the next deployment. Do not flag this as a missing-transaction/atomicity issue in future reviews.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-04-16T14:21:15.229Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/components/logs/LogsTaskFilter.tsx:135-163
Timestamp: 2026-04-16T14:21:15.229Z
Learning: In `triggerdotdev/trigger.dev` PR `#3368`, the `TaskIdentifier` table has a `@unique([runtimeEnvironmentId, slug])` DB constraint, guaranteeing one row per (environment, slug). In components like `apps/webapp/app/components/logs/LogsTaskFilter.tsx` and `apps/webapp/app/components/runs/v3/RunFilters.tsx`, using `key={item.slug}` for SelectItem list items is correct and unique. Do NOT flag `key={item.slug}` as potentially non-unique — the old duplicate-(slug, triggerSource) issue only existed with the legacy `DISTINCT` query, which this registry replaces.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-03-13T13:42:25.092Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3213
File: apps/webapp/app/routes/admin.llm-models.new.tsx:65-91
Timestamp: 2026-03-13T13:42:25.092Z
Learning: In `apps/webapp/app/routes/admin.llm-models.new.tsx`, sequential Prisma writes for model/tier creation are intentionally not wrapped in a transaction. The form is admin-only with low concurrency risk, and the blast radius is considered minimal for admin tooling.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-04-16T13:45:18.782Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/test/engine/taskIdentifierRegistry.test.ts:3-19
Timestamp: 2026-04-16T13:45:18.782Z
Learning: In `apps/webapp/test/engine/taskIdentifierRegistry.test.ts`, the `vi.mock` calls for `~/services/taskIdentifierCache.server` (stubbing `getTaskIdentifiersFromCache` and `populateTaskIdentifierCache`), `~/models/task.server` (stubbing `getAllTaskIdentifiers`), and `~/db.server` (stubbing `prisma` and `$replica`) are intentional. The suite uses real Postgres via testcontainers for all `TaskIdentifier` DB operations, but isolates the Redis cache layer and legacy query fallback as separate concerns not exercised in this test file. Do not flag these mocks as violations of the no-mocks policy in future reviews.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-03-22T13:49:23.474Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: internal-packages/database/prisma/migrations/20260318114244_add_prompt_friendly_id/migration.sql:5-5
Timestamp: 2026-03-22T13:49:23.474Z
Learning: In `internal-packages/database/prisma/migrations/**/*.sql`: When a column and its index are added in a follow-up migration file but the parent table itself was introduced in the same PR (i.e., no production rows exist yet), a plain `CREATE INDEX` / `CREATE UNIQUE INDEX` (without CONCURRENTLY) is safe and does not require splitting into a separate migration. The CONCURRENTLY requirement only applies when the table already has existing data in production.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-04-16T14:21:14.907Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: internal-packages/database/prisma/schema.prisma:666-666
Timestamp: 2026-04-16T14:21:14.907Z
Learning: In `triggerdotdev/trigger.dev`, the `BackgroundWorkerTask` covering index on `(runtimeEnvironmentId, slug, triggerSource)` lives in `internal-packages/database/prisma/migrations/20260413000000_add_bwt_covering_index/migration.sql` as a `CREATE INDEX CONCURRENTLY IF NOT EXISTS`, intentionally in its own migration file separate from the `TaskIdentifier` table migration. Do not flag this index as missing from the schema migrations in future reviews.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-03-26T10:02:25.354Z

Learnt from: 0ski
Repo: triggerdotdev/trigger.dev PR: 3254
File: apps/webapp/app/services/platformNotifications.server.ts:363-385
Timestamp: 2026-03-26T10:02:25.354Z
Learning: In `triggerdotdev/trigger.dev`, the `getNextCliNotification` fallback in `apps/webapp/app/services/platformNotifications.server.ts` intentionally uses `prisma.orgMember.findFirst` (single org) when no `projectRef` is provided. This is acceptable for v1 because the CLI (`dev` and `login` commands) always passes `projectRef` in normal usage, making the fallback a rare edge case. Do not flag the single-org fallback as a multi-org correctness bug in this file.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-04-16T14:19:16.309Z

Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: apps/webapp/CLAUDE.md:0-0
Timestamp: 2026-04-16T14:19:16.309Z
Learning: Applies to apps/webapp/app/v3/services/queues.server.ts : If adding a new task-level default, add it to the existing `select` clause in the `backgroundWorkerTask.findFirst()` query in `queues.server.ts` — do NOT add a second query. If the default doesn't need to be known at trigger time, resolve it at dequeue time instead

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-04-07T14:12:59.018Z

Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3331
File: apps/webapp/app/runEngine/concerns/batchPayloads.server.ts:112-136
Timestamp: 2026-04-07T14:12:59.018Z
Learning: In `apps/webapp/app/runEngine/concerns/batchPayloads.server.ts`, the `pRetry` call wrapping `uploadPacketToObjectStore` intentionally retries **all** error types (no `shouldRetry` filter / `AbortError` guards). The maintainer explicitly prefers over-retrying to under-retrying because multiple heterogeneous object store backends are supported and it is impractical to enumerate all permanent error signatures. Do not flag this as an issue in future reviews.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-02-25T17:28:20.456Z

Learnt from: isshaddad
Repo: triggerdotdev/trigger.dev PR: 3130
File: docs/v3-openapi.yaml:3134-3135
Timestamp: 2026-02-25T17:28:20.456Z
Learning: In the Trigger.dev codebase, the `publicAccessToken` returned by the SDK's `wait.createToken()` method is not part of the HTTP response body from `POST /api/v1/waitpoints/tokens`. The server returns only `{ id, isCached, url }`. The SDK's `prepareData` hook generates the JWT client-side from the `x-trigger-jwt-claims` response header after the HTTP call completes. The OpenAPI spec correctly documents only the HTTP response body, not SDK transformations.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2025-09-02T11:18:06.602Z

Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 2463
File: apps/webapp/app/services/gitHubSession.server.ts:31-36
Timestamp: 2025-09-02T11:18:06.602Z
Learning: In the GitHub App installation flow in apps/webapp/app/services/gitHubSession.server.ts, the redirectTo parameter stored in httpOnly session cookies is considered acceptable without additional validation by the maintainer, as the httpOnly cookie provides sufficient security for this use case.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-04-16T14:21:09.410Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/services/taskIdentifierCache.server.ts:33-39
Timestamp: 2026-04-16T14:21:09.410Z
Learning: In `apps/webapp/app/services/taskIdentifierCache.server.ts`, the `decode()` function intentionally uses a plain `JSON.parse` cast instead of Zod validation. The Redis cache is exclusively written by the internal `populateTaskIdentifierCache` function via the symmetric `encode()` helper — there is no external input path. Any shape mismatch would be a serialization bug to surface explicitly, not untrusted data to filter out. Do not suggest adding Zod validation to the `decode()` function or the `getTaskIdentifiersFromCache` return path in future reviews.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-03-10T17:56:26.581Z

Learnt from: samejr
Repo: triggerdotdev/trigger.dev PR: 3201
File: apps/webapp/app/v3/services/setSeatsAddOn.server.ts:25-29
Timestamp: 2026-03-10T17:56:26.581Z
Learning: In the `triggerdotdev/trigger.dev` webapp, service classes such as `SetSeatsAddOnService` and `SetBranchesAddOnService` do NOT need to perform their own userId-to-organizationId authorization checks. Auth is enforced at the route layer: `requireUserId(request)` authenticates the user, and the `_app.orgs.$organizationSlug` layout route enforces that the authenticated user is a member of the org. Any `userId` and `organizationId` reaching these services from org-scoped routes are already validated. This is the consistent pattern used across all org-scoped services in the codebase.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2025-08-14T10:53:54.526Z

Learnt from: myftija
Repo: triggerdotdev/trigger.dev PR: 2391
File: apps/webapp/app/services/organizationAccessToken.server.ts:50-0
Timestamp: 2025-08-14T10:53:54.526Z
Learning: In the Trigger.dev codebase, token service functions (like revokePersonalAccessToken and revokeOrganizationAccessToken) don't include tenant scoping in their database queries. Instead, authorization and tenant scoping happens at a higher level in the authentication flow (typically in route handlers) before these service functions are called. This is a consistent pattern across both Personal Access Tokens (PATs) and Organization Access Tokens (OATs).

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-02-11T16:50:14.167Z

Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3019
File: apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.dashboards.$dashboardId.widgets.tsx:126-131
Timestamp: 2026-02-11T16:50:14.167Z
Learning: In apps/webapp/app/routes/resources.orgs.$organizationSlug.projects.$projectParam.env.$envParam.dashboards.$dashboardId.widgets.tsx, MetricsDashboard entities are intentionally scoped to the organization level, not the project level. The dashboard lookup should filter by organizationId only (not projectId), allowing dashboards to be accessed across projects within the same organization. The optional projectId field on MetricsDashboard serves other purposes and should not be used as an authorization constraint.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-03-22T13:26:12.060Z

Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

📚 Learning: 2026-03-22T19:24:14.403Z

Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

apps/webapp/app/routes/api.v1.sessions.ts

🔇 Additional comments (3)

apps/webapp/app/routes/api.v1.sessions.ts (3)

27-86: Loader LGTM.

Cursor direction derivation (page[before] → backward, else forward), the super-scope authorization tuple, and the conditional next/previous spread all line up with the repository contract in clickhouseSessionsRepository.server.ts. Re-injecting projectId/environmentType/organizationId on lines 75-77 before serializeSession is a small smell: per services/realtime/sessions.server.ts:38-57, serializeSession does not consume those fields, and they exist only to satisfy the as Session cast. Not worth churning, though.
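For readers without the diff open, the cursor handling praised above might look roughly like the sketch below. It is an illustration only, not the route's code: `deriveDirection`, `buildPagination`, and the `PageResult` shape are hypothetical names. Only the rule that `page[before]` selects backward pagination, and the conditional next/previous spread, come from the comment.

```typescript
type Direction = "forward" | "backward";

// Hypothetical shape of one page as returned by a repository.
interface PageResult {
  nextCursor?: string;
  previousCursor?: string;
}

// page[before] present means the caller is paging backward; anything else is forward.
function deriveDirection(query: Record<string, string | undefined>): Direction {
  return query["page[before]"] !== undefined ? "backward" : "forward";
}

// Spread next/previous into the response only when a cursor exists,
// so absent cursors are omitted rather than serialized as undefined.
function buildPagination(result: PageResult): { next?: string; previous?: string } {
  return {
    ...(result.nextCursor ? { next: result.nextCursor } : {}),
    ...(result.previousCursor ? { previous: result.previousCursor } : {}),
  };
}
```

Omitting the keys entirely lets a client test for the presence of `next` or `previous` to decide whether another page exists.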

155-161: Error-handling fix looks good.

ServiceValidationError → 422 preserved; everything else is logged server-side and returns a generic 500 body. Matches the PR commit-message intent and closes the prior raw-error.message leak.
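The mapping described here can be sketched as below. This is an assumption-laden illustration: the `ServiceValidationError` class is a local stand-in for the codebase's own class, while `toErrorResponse`, the `log` callback, and the generic message text are hypothetical.

```typescript
// Local stand-in for the codebase's ServiceValidationError.
class ServiceValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ServiceValidationError";
    // Keeps instanceof reliable even when compiled to older targets.
    Object.setPrototypeOf(this, ServiceValidationError.prototype);
  }
}

interface ErrorResponse {
  status: number;
  body: { error: string };
}

function toErrorResponse(
  error: unknown,
  log: (message: string, error: unknown) => void
): ErrorResponse {
  if (error instanceof ServiceValidationError) {
    // Validation messages are written for the caller, so they are safe to return.
    return { status: 422, body: { error: error.message } };
  }
  // Everything else is logged server-side; the client sees only a generic body,
  // so internal messages (column names, query fragments) never leave the server.
  log("Unexpected error while handling request", error);
  return { status: 500, body: { error: "Something went wrong" } };
}
```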

108-131: No issue here — update: {} does not trigger writes or @updatedAt updates.

With update: {} in the upsert, Prisma performs a SELECT to check existence and stops. It does not generate or execute an UPDATE statement, so @updatedAt is not refreshed and no replication event is emitted. This is the documented way to emulate findOrCreate behavior. The current code is correct and requires no changes.

Walkthrough

Adds a durable Session primitive across the stack: a Prisma Session model and migration, ClickHouse sessions_v1 table and ClickHouse client helpers, new ClickHouse-backed SessionsRepository, a SessionsReplicationService that streams Postgres logical replication into ClickHouse, session-friendly ID and API schemas in core, multiple REST and realtime routes for session CRUD and streaming/append, environment config and startup wiring for replication, session helper utilities, and end-to-end replication tests.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'feat: Sessions - bidirectional durable agent streams' clearly and concisely summarizes the main change: introduction of a new Session primitive for bidirectional communication.
Description check	✅ Passed	The PR description is comprehensive and well-structured, covering what is enabled, use cases, public API surface, implementation details, verification steps, and test plan, but the author did not complete the required checklist template sections.

…te/update

The session_ prefix identifies internal friendlyIds. Allowing it in a user-supplied externalId would misroute subsequent GET/PATCH/close requests through resolveSessionByIdOrExternalId to a friendlyId lookup, returning null or the wrong session. Reject at the schema boundary so both routes surface a clean 422.
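A minimal sketch of why the prefix has to be reserved. It assumes the resolver decides which lookup to run purely from the `session_` prefix, as the commit message describes. `validateExternalId` and `lookupKind` are hypothetical helpers, not functions from the PR, and the PR itself rejects at the schema boundary rather than in a hand-written check like this one.

```typescript
const FRIENDLY_ID_PREFIX = "session_";

type ValidationResult =
  | { ok: true }
  | { ok: false; status: 422; message: string };

// Reject externalIds that would be mistaken for friendlyIds later on.
function validateExternalId(externalId: string): ValidationResult {
  if (externalId.startsWith(FRIENDLY_ID_PREFIX)) {
    return {
      ok: false,
      status: 422,
      message: `externalId must not start with "${FRIENDLY_ID_PREFIX}"`,
    };
  }
  return { ok: true };
}

// The prefix alone picks the lookup, which is why it must stay unambiguous.
function lookupKind(idOrExternalId: string): "friendlyId" | "externalId" {
  return idOrExternalId.startsWith(FRIENDLY_ID_PREFIX) ? "friendlyId" : "externalId";
}
```

Without the validation step, an externalId such as `session_mine` would be stored successfully but then looked up as a friendlyId on every later request.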

Without allowJWT/corsStrategy, frontend clients holding public access tokens hit 401 on GET /api/v1/sessions and browser preflights fail. Matches the single-session GET/PATCH/close routes and the runs list endpoint.

- Derive isCached from the upsert result (id mismatch = pre-existing row) instead of doing a separate findFirst first. The pre-check was racy: two concurrent first-time POSTs could both return 201 with isCached: false. Using the returned row's id is atomic and saves a round-trip.
- Scope the list endpoint's authorization to the standard action/resource pattern (matches api.v1.runs.ts): task-scoped JWTs can list sessions filtered by their task, and broader super-scopes (read:sessions, read:all, admin) authorize unfiltered listing.
- Log and swallow unexpected errors on POST rather than returning the raw error.message. Prisma/internal messages can leak column names and query fragments.
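The id-mismatch trick from the first item can be illustrated with an in-memory stand-in for the database. Everything here is hypothetical (`FakeSessionTable`, `createSession`, the row shape). It only demonstrates the idea of generating the id up front and comparing it with the row the upsert returns.

```typescript
interface SessionRow {
  id: string;
  externalId: string;
}

// In-memory stand-in for a table with a unique index on externalId.
class FakeSessionTable {
  private rows = new Map<string, SessionRow>();

  // Mimics an upsert with an empty update: insert when absent,
  // otherwise return the existing row untouched.
  upsert(candidate: SessionRow): SessionRow {
    const existing = this.rows.get(candidate.externalId);
    if (existing) return existing;
    this.rows.set(candidate.externalId, candidate);
    return candidate;
  }
}

function createSession(table: FakeSessionTable, externalId: string, newId: string) {
  const row = table.upsert({ id: newId, externalId });
  // If the returned id is not the one we generated, another request created the row first.
  const isCached = row.id !== newId;
  return { row, isCached };
}
```

Because the decision rests on the single upsert result rather than a separate existence check, two racing callers cannot both conclude that they created the row.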

devin-ai-integration

Devin Review found 2 new potential issues.

View 9 additional findings in Devin Review.

code review fixes

3d5873c

ericallam added 2 commits April 20, 2026 15:58

fix(webapp): allow JWT + CORS on sessions list endpoint

ff46f33

Without allowJWT/corsStrategy, frontend clients holding public access tokens hit 401 on GET /api/v1/sessions and browser preflights fail. Matches the single-session GET/PATCH/close routes and the runs list endpoint.

devin-ai-integration bot reviewed Apr 20, 2026

Comment thread apps/webapp/app/services/sessionsReplicationService.server.ts

Comment thread apps/webapp/app/routes/api.v1.sessions.ts

myftija approved these changes Apr 20, 2026

feat: Sessions - bidirectional durable agent streams #3417

ericallam wants to merge 5 commits into main from feature/tri-8627-session-primitive-server-side-schema-routes-clickhouse

changeset-bot bot commented Apr 20, 2026 (edited)

🦋 Changeset detected

coderabbitai bot commented Apr 20, 2026 (edited)
