41 changes: 41 additions & 0 deletions netlify/functions/cleanup-docs-cache-background.ts
@@ -0,0 +1,41 @@
import type { Config } from '@netlify/functions'
import { pruneOldCacheEntries } from '~/utils/github-content-cache.server'

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000

const handler = async (req: Request) => {
const { next_run } = await req.json()

console.log('[cleanup-docs-cache] Starting docs cache prune...')

const startTime = Date.now()

try {
const { contentDeleted, artifactDeleted, threshold } =
await pruneOldCacheEntries(THIRTY_DAYS_MS)

const duration = Date.now() - startTime
console.log(
`[cleanup-docs-cache] Completed in ${duration}ms - Deleted ${contentDeleted.toLocaleString()} content rows and ${artifactDeleted.toLocaleString()} artifact rows older than ${threshold.toISOString()}`,
)
console.log('[cleanup-docs-cache] Next invocation at:', next_run)
} catch (error) {
const duration = Date.now() - startTime
const errorMessage = error instanceof Error ? error.message : String(error)
const errorStack = error instanceof Error ? error.stack : undefined

console.error(
`[cleanup-docs-cache] Failed after ${duration}ms:`,
errorMessage,
)
if (errorStack) {
console.error('[cleanup-docs-cache] Stack:', errorStack)
}
}
}
Comment on lines +6 to +35

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For `@netlify/functions` v5.1.5 scheduled functions:

  1. Is a JSON body with `next_run` guaranteed for every invocation?
  2. What return/throw behavior marks a run as failed vs successful?
  3. Is returning a `Response` recommended for scheduled handlers?

💡 Result:

  1. Yes, the JSON body with `next_run` is guaranteed for every invocation of scheduled functions. Official docs state: "For scheduled functions, the request body is a JSON-encoded object containing a next_run property. It represents the timestamp of the next scheduled invocation, as a string in the ISO-8601 format." Code examples show `const { next_run } = await req.json()`.
  2. Unhandled exceptions/throws mark a run as failed; successful completion without errors (regardless of return value) marks it as successful. Function logs show invocations with log levels like INFO/ERROR, and metrics track success/error rates. As with standard serverless functions, errors during execution are logged as failures.
  3. No, returning a `Response` object is not required or particularly recommended for scheduled handlers, as they do not return a response body to any client (they run in the background like cron jobs). The docs note: "Scheduled functions don't support response streaming because they don't return a response body." Examples return plain objects like `{ statusCode: 200 }` or nothing explicit; the return value does not affect scheduling behavior.
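The body shape described in point 1 can be sanity-checked locally with the standard `Request` API; the URL and timestamp below are made-up illustration values, not taken from the PR:

```typescript
// Simulate the JSON body a scheduled invocation receives: a POST with a
// `next_run` property holding an ISO-8601 timestamp (per the docs quoted
// above). Values here are fabricated for illustration only.
async function readNextRun(req: Request): Promise<string> {
  const { next_run } = await req.json()
  return next_run
}

const fakeScheduledRequest = new Request('http://localhost/fn', {
  method: 'POST',
  body: JSON.stringify({ next_run: '2026-01-01T03:00:00.000Z' }),
})

readNextRun(fakeScheduledRequest).then((next_run) => {
  console.log('next invocation at:', next_run)
})
```

This mirrors how the handler in the PR destructures `next_run`, without any Netlify runtime involved.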

🏁 Script executed:

cat -n netlify/functions/cleanup-docs-cache-background.ts | head -50

Repository: TanStack/tanstack.com

Length of output: 1670


Move body parsing inside try and rethrow errors to mark failures.

`await req.json()` (line 7) can throw before error handling begins, and the catch block currently swallows errors without rethrowing, which silently hides failed prune runs. Move the JSON parsing inside the `try` block and rethrow in the `catch` block so Netlify marks the invocation as failed when an error occurs.

Suggested fix
const handler = async (req: Request) => {
-  const { next_run } = await req.json()
-
-  console.log('[cleanup-docs-cache] Starting docs cache prune...')
-
   const startTime = Date.now()
+  console.log('[cleanup-docs-cache] Starting docs cache prune...')

   try {
+    const { next_run } = await req.json()
+
     const { contentDeleted, artifactDeleted, threshold } =
       await pruneOldCacheEntries(THIRTY_DAYS_MS)

     const duration = Date.now() - startTime
     console.log(
       `[cleanup-docs-cache] Completed in ${duration}ms - Deleted ${contentDeleted.toLocaleString()} content rows and ${artifactDeleted.toLocaleString()} artifact rows older than ${threshold.toISOString()}`,
     )
     console.log('[cleanup-docs-cache] Next invocation at:', next_run)
   } catch (error) {
     const duration = Date.now() - startTime
     const errorMessage = error instanceof Error ? error.message : String(error)
     const errorStack = error instanceof Error ? error.stack : undefined

     console.error(
       `[cleanup-docs-cache] Failed after ${duration}ms:`,
       errorMessage,
     )
     if (errorStack) {
       console.error('[cleanup-docs-cache] Stack:', errorStack)
     }
+    throw error
   }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-const handler = async (req: Request) => {
-  const { next_run } = await req.json()
-  console.log('[cleanup-docs-cache] Starting docs cache prune...')
-  const startTime = Date.now()
-  try {
-    const { contentDeleted, artifactDeleted, threshold } =
-      await pruneOldCacheEntries(THIRTY_DAYS_MS)
-    const duration = Date.now() - startTime
-    console.log(
-      `[cleanup-docs-cache] Completed in ${duration}ms - Deleted ${contentDeleted.toLocaleString()} content rows and ${artifactDeleted.toLocaleString()} artifact rows older than ${threshold.toISOString()}`,
-    )
-    console.log('[cleanup-docs-cache] Next invocation at:', next_run)
-  } catch (error) {
-    const duration = Date.now() - startTime
-    const errorMessage = error instanceof Error ? error.message : String(error)
-    const errorStack = error instanceof Error ? error.stack : undefined
-    console.error(
-      `[cleanup-docs-cache] Failed after ${duration}ms:`,
-      errorMessage,
-    )
-    if (errorStack) {
-      console.error('[cleanup-docs-cache] Stack:', errorStack)
-    }
-  }
-}
+const handler = async (req: Request) => {
+  const startTime = Date.now()
+  console.log('[cleanup-docs-cache] Starting docs cache prune...')
+  try {
+    const { next_run } = await req.json()
+    const { contentDeleted, artifactDeleted, threshold } =
+      await pruneOldCacheEntries(THIRTY_DAYS_MS)
+    const duration = Date.now() - startTime
+    console.log(
+      `[cleanup-docs-cache] Completed in ${duration}ms - Deleted ${contentDeleted.toLocaleString()} content rows and ${artifactDeleted.toLocaleString()} artifact rows older than ${threshold.toISOString()}`,
+    )
+    console.log('[cleanup-docs-cache] Next invocation at:', next_run)
+  } catch (error) {
+    const duration = Date.now() - startTime
+    const errorMessage = error instanceof Error ? error.message : String(error)
+    const errorStack = error instanceof Error ? error.stack : undefined
+    console.error(
+      `[cleanup-docs-cache] Failed after ${duration}ms:`,
+      errorMessage,
+    )
+    if (errorStack) {
+      console.error('[cleanup-docs-cache] Stack:', errorStack)
+    }
+    throw error
+  }
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@netlify/functions/cleanup-docs-cache-background.ts` around lines 6 - 35, the
handler currently calls `await req.json()` before the try/catch and also swallows
errors in the catch; move the request body parsing (`await req.json()`) inside
the try block so JSON parse errors are caught, and in the catch block rethrow
the error after logging so failures are propagated to Netlify.


export default handler

export const config: Config = {
schedule: '0 3 * * *',
}
55 changes: 25 additions & 30 deletions src/utils/github-content-cache.server.ts
@@ -1,4 +1,4 @@
-import { and, eq, sql } from 'drizzle-orm'
+import { and, eq, lt, sql } from 'drizzle-orm'
import { db } from '~/db/client'
import {
docsArtifactCache,
@@ -42,12 +42,6 @@ function isFresh(staleAt: Date) {
return staleAt.getTime() > Date.now()
}

-function queueRefresh(key: string, fn: () => Promise<unknown>) {
-  void withPendingRefresh(key, fn).catch((error) => {
-    console.error(`[GitHub Cache] Failed to refresh ${key}:`, error)
-  })
-}
-
function readStoredTextValue(row: GithubContentCache | undefined) {
if (!row) {
return undefined
@@ -166,19 +160,8 @@ async function getCachedGitHubContent<T>(opts: {
const cachedRow = await readRow()
const storedValue = opts.readStoredValue(cachedRow)

-  if (storedValue !== undefined) {
-    if (cachedRow && isFresh(cachedRow.staleAt)) {
-      return storedValue
-    }
-
-    if (storedValue !== null) {
-      queueRefresh(opts.cacheKey, async () => {
-        const value = await opts.origin()
-        await persist(value)
-      })
-
-      return storedValue
-    }
-  }
+  if (storedValue !== undefined && cachedRow && isFresh(cachedRow.staleAt)) {
+    return storedValue
}

return withPendingRefresh(opts.cacheKey, async () => {
@@ -297,16 +280,7 @@ export async function getCachedDocsArtifact<T>(opts: {
const storedValue =
cachedRow && opts.isValue(cachedRow.payload) ? cachedRow.payload : undefined

-  if (storedValue !== undefined) {
-    if (cachedRow && isFresh(cachedRow.staleAt)) {
-      return storedValue
-    }
-
-    queueRefresh(cacheKey, async () => {
-      const payload = await opts.build()
-      await upsertDocsArtifact({ ...opts, payload })
-    })
-
+  if (storedValue !== undefined && cachedRow && isFresh(cachedRow.staleAt)) {
return storedValue
}

@@ -382,6 +356,27 @@ export async function markGitHubContentStale(
return rowCount
}

export async function pruneOldCacheEntries(olderThanMs: number) {
const threshold = new Date(Date.now() - olderThanMs)
Comment on lines +359 to +360

⚠️ Potential issue | 🟠 Major

Guard `olderThanMs` to prevent accidental mass deletion.

A non-finite or non-positive `olderThanMs` shifts the threshold into the future and deletes far more than intended. Add an explicit runtime guard before computing `threshold`.

Suggested fix
 export async function pruneOldCacheEntries(olderThanMs: number) {
+  if (!Number.isFinite(olderThanMs) || olderThanMs <= 0) {
+    throw new Error('olderThanMs must be a positive finite number')
+  }
   const threshold = new Date(Date.now() - olderThanMs)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-export async function pruneOldCacheEntries(olderThanMs: number) {
-  const threshold = new Date(Date.now() - olderThanMs)
+export async function pruneOldCacheEntries(olderThanMs: number) {
+  if (!Number.isFinite(olderThanMs) || olderThanMs <= 0) {
+    throw new Error('olderThanMs must be a positive finite number')
+  }
+  const threshold = new Date(Date.now() - olderThanMs)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/github-content-cache.server.ts` around lines 385 - 386, guard the
pruneOldCacheEntries function against invalid olderThanMs values: before
computing threshold, validate that olderThanMs is a finite number greater than
zero (e.g., Number.isFinite(olderThanMs) && olderThanMs > 0); if the check
fails, throw or return early, so that threshold = new Date(Date.now() -
olderThanMs) only runs with a safe value.


const [contentDeleted, artifactDeleted] = await Promise.all([
db
.delete(githubContentCache)
.where(lt(githubContentCache.updatedAt, threshold))
.returning({ repo: githubContentCache.repo }),
db
.delete(docsArtifactCache)
.where(lt(docsArtifactCache.updatedAt, threshold))
.returning({ repo: docsArtifactCache.repo }),
coderabbitai[bot] marked this conversation as resolved.
])
Comment on lines +362 to +371

⚠️ Potential issue | 🟠 Major

Avoid returning every deleted row just to compute counts.

Using `.returning({ repo })` pulls one result row per deleted record into memory. On large prunes, this can become a hot-path memory/latency issue. Prefer DB-side counting (e.g., a CTE `DELETE ... RETURNING 1` plus `COUNT(*)`) so only scalar counts are returned.
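As a rough sketch of that suggestion (table and column names are assumed from the PR, not verified against the schema), the CTE would take this shape; in real code the statement would be executed through the ORM's raw-SQL escape hatch rather than built as a plain string:

```typescript
// Sketch of DB-side counting for the prune: the DELETE runs inside a CTE
// and only a single scalar count comes back, instead of one returned row
// per deleted record. Names github_content_cache/updated_at are assumed.
const pruneContentCountSql = `
  WITH deleted AS (
    DELETE FROM github_content_cache
    WHERE updated_at < $1
    RETURNING 1
  )
  SELECT count(*)::int AS content_deleted FROM deleted
`.trim()
```

The same shape would apply to `docs_artifact_cache`; the caller would then read a single `content_deleted` integer instead of `contentDeleted.length`.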

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/utils/github-content-cache.server.ts` around lines 388 - 397, the
current delete calls load a row per deleted record by using returning({ repo:
... }) on githubContentCache and docsArtifactCache, which can exhaust memory on
large prunes; change each delete to count server-side instead (e.g., use a CTE
that does DELETE ... RETURNING 1 and then SELECT COUNT(*) from that CTE, or run
a single raw SQL statement that deletes and returns a scalar count) so only
integer counts are returned; update the db.delete(...) calls for
githubContentCache and docsArtifactCache (and any code that reads
contentDeleted/artifactDeleted) to accept numeric counts rather than arrays of
rows.


return {
contentDeleted: contentDeleted.length,
artifactDeleted: artifactDeleted.length,
threshold,
}
}

export async function markDocsArtifactsStale(
opts: {
gitRef?: string