feat(jobs): Add data retention jobs #4128
@TheodoreSpeaks let's consolidate the migrations into a single one: just delete the existing ones and run it once over all the changes in schema.ts
57194bd to a3c1bab
a3c1bab to 63f7b59
@BugBot review
PR Summary
High Risk
Overview
Cron endpoints are added for
Introduces an enterprise-only Data Retention settings page and
Reviewed by Cursor Bugbot for commit f546653.
@BugBot review
0cb8970 to 9cb8dba
Greptile Summary
Adds three data retention background jobs (soft-delete cleanup, log cleanup, task/chat cleanup) dispatched via Trigger.dev or an inline fallback, with an enterprise-gated UI and API for per-workspace configuration. The migration replaces full soft-delete indexes with partial indexes and adds three retention columns to
Confidence Score: 3/5
Not safe to merge as-is: the S3 cleanup gap will permanently orphan workspace_file objects in object storage on every cleanup run.
One confirmed P1 data-integrity bug (workspace_file S3 objects never deleted) that will silently accumulate orphaned cloud storage objects on each cron execution. Everything else (batching logic, auth, migration, Trigger.dev wiring, enterprise UI) is well-structured.
apps/sim/background/cleanup-soft-deletes.ts: cleanupWorkspaceFileStorage must also cover the workspaceFile (singular) table
Sequence Diagram
sequenceDiagram
participant Cron as Cron (GET /api/cron/*)
participant Dispatcher as dispatchCleanupJobs
participant Queue as JobQueue (Trigger.dev / DB)
participant Task as Background Task
participant DB as Database
participant S3 as Object Storage
participant Copilot as Copilot Backend
Cron->>Dispatcher: dispatchCleanupJobs(jobType, retentionColumn)
Dispatcher->>Queue: enqueue free-tier job
Dispatcher->>Queue: enqueue paid-tier job
Dispatcher->>DB: query enterprise workspaces with non-NULL retention
Dispatcher->>Queue: batchTrigger enterprise jobs
Queue->>Task: run(payload)
Task->>DB: resolveTierWorkspaceIds or lookup workspace retention
Task->>DB: SELECT expiring rows (batched, LIMIT 2000)
Task->>S3: delete associated files (pre-deletion)
Task->>Copilot: POST /api/tasks/cleanup (chat IDs)
Task->>DB: DELETE rows by ID
Task-->>Queue: complete
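The task half of the diagram (select a batch, delete associated files, then delete rows by ID) can be sketched roughly as below. This is a minimal in-memory illustration, not the PR's implementation: `ExpirableRow`, `FakeStore`, `selectExpired`, and `cleanupExpiredRows` are hypothetical names, and arrays stand in for the database and object storage. Only the batch limit of 2000 and the files-before-rows ordering come from the diagram.

```typescript
// Hypothetical row shape; the real tables are not shown in this thread.
interface ExpirableRow {
  id: string
  deletedAt: Date | null // null means the row was never soft-deleted
  fileKey: string | null // storage key of an associated file, if any
}

// In-memory stand-in for the database table and the object store.
interface FakeStore {
  rows: ExpirableRow[]
  files: string[]
}

const DEFAULT_BATCH_SIZE = 2000 // mirrors "LIMIT 2000" in the diagram

// Equivalent of "SELECT expiring rows (batched, LIMIT n)".
function selectExpired(store: FakeStore, cutoff: Date, limit: number): ExpirableRow[] {
  return store.rows
    .filter((r) => r.deletedAt !== null && r.deletedAt.getTime() < cutoff.getTime())
    .slice(0, limit)
}

// Repeats select -> delete files -> delete rows until nothing is left to expire.
function cleanupExpiredRows(
  store: FakeStore,
  cutoff: Date,
  batchSize: number = DEFAULT_BATCH_SIZE
): number {
  let deleted = 0
  for (;;) {
    const batch = selectExpired(store, cutoff, batchSize)
    if (batch.length === 0) break

    // Files go first: once the rows are gone, their storage keys are unrecoverable.
    const keys = batch.map((r) => r.fileKey).filter((k): k is string => k !== null)
    store.files = store.files.filter((f) => keys.indexOf(f) === -1)

    // Then delete the rows by ID.
    const ids = batch.map((r) => r.id)
    store.rows = store.rows.filter((r) => ids.indexOf(r.id) === -1)

    deleted += batch.length
  }
  return deleted
}
```

Deleting files before rows is what makes the orphaned-object concern raised in the review relevant: any table whose rows are removed without first collecting their storage keys leaves objects that can no longer be found.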
Reviews (2): Last reviewed commit: "fix lint"
@greptile review
Add 3 cron-triggered cleanup jobs dispatched via Trigger.dev (or inline fallback):
- cleanup-soft-deletes: hard-deletes soft-deleted workspace resources past retention
- cleanup-logs: deletes expired workflow execution logs + S3 files
- cleanup-tasks: deletes expired copilot chats, runs, feedback, inbox tasks

Enterprise admins can configure per-workspace retention via Settings > Data Retention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# Conflicts:
#	packages/db/migrations/meta/0192_snapshot.json
#	packages/db/migrations/meta/_journal.json
#	packages/db/schema.ts
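The dispatch pattern described in this commit (one free-tier job, one paid-tier job, and one job per enterprise workspace with a non-NULL retention, sent to a queue or run inline) might look something like the sketch below. All names other than `dispatchCleanupJobs` and the three job names are assumptions made for illustration: `WorkspaceRow`, `CleanupPayload`, `buildCleanupPayloads`, the `retentionHours` field and its unit, and the callback signatures are hypothetical. The sketch enqueues payloads one at a time rather than using a batch trigger.

```typescript
type JobType = 'cleanup-soft-deletes' | 'cleanup-logs' | 'cleanup-tasks'

// Hypothetical payloads: tier-wide jobs resolve their workspaces later,
// enterprise jobs carry a single workspace and its configured retention.
type CleanupPayload =
  | { jobType: JobType; plan: 'free' | 'paid' }
  | { jobType: JobType; plan: 'enterprise'; workspaceId: string; retentionHours: number }

// Hypothetical workspace row; retentionHours is null when nothing is configured.
interface WorkspaceRow {
  id: string
  plan: 'free' | 'paid' | 'enterprise'
  retentionHours: number | null
}

function buildCleanupPayloads(jobType: JobType, workspaces: WorkspaceRow[]): CleanupPayload[] {
  const payloads: CleanupPayload[] = [
    { jobType, plan: 'free' },
    { jobType, plan: 'paid' },
  ]
  for (const ws of workspaces) {
    // Only enterprise workspaces with an explicit retention get their own job.
    if (ws.plan === 'enterprise' && ws.retentionHours !== null) {
      payloads.push({
        jobType,
        plan: 'enterprise',
        workspaceId: ws.id,
        retentionHours: ws.retentionHours,
      })
    }
  }
  return payloads
}

// enqueue is null when no job queue is configured; the job then runs inline.
function dispatchCleanupJobs(
  jobType: JobType,
  workspaces: WorkspaceRow[],
  enqueue: ((payload: CleanupPayload) => void) | null,
  runInline: (payload: CleanupPayload) => void
): number {
  const payloads = buildCleanupPayloads(jobType, workspaces)
  for (const payload of payloads) {
    if (enqueue !== null) {
      enqueue(payload)
    } else {
      runInline(payload)
    }
  }
  return payloads.length
}
```

Keeping payload construction separate from delivery means the same payloads reach either path, so the inline fallback cannot drift from what the queue would have received.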
db339c8 to be59d2b
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit be59d2b.

Summary
Add data retention jobs. 3 jobs created: cleanup-soft-deletes, cleanup-logs, and cleanup-tasks.