
feat: add System Status page with Celery queue monitoring and purge#14349

Merged
Maffooch merged 10 commits into DefectDojo:dev from valentijnscholten:feature/celery-queue-status-ui
Apr 3, 2026

Conversation


@valentijnscholten valentijnscholten commented Feb 19, 2026

Summary

Add a Celery Status page showing the status of the broker and the worker(s) and the number of pending tasks in the queue. The queue can also be purged.
A separate API call and button fetch more detail, such as the number of tasks per task name, along with a purge button per task name.

  • New System Status page (/system_status) accessible to superusers from the navigation menu, alongside System Settings. Displays Celery worker liveness, pending queue length, and active task timeout/expiry settings.
  • Separate Redis Broker and Celery Worker status badges so it is immediately clear which component is failing. Both show a spinner while loading.
  • Loading spinners on page load and after clicking Refresh — all status indicators reset to a spinner state before re-fetching.
  • Purge queue button to remove all pending tasks (with a warning about re-running deduplication via python manage.py dedupe). Reloads the page after a successful purge.
  • View Details button (next to Refresh) — triggers an O(N) full queue scan and renders a per-task breakdown table. Disabled and shown with a warning when the broker is unreachable. Shows a spinner while loading.
    • Table columns: #, task name, count, oldest queue position, newest queue position, expiry timestamps (human-readable local time + time-left indicator), per-row purge button.
    • Rows sorted by oldest queue position ascending.
  • Per-task purge — each row in the breakdown table has a Purge button that removes all queued tasks with that task name via a targeted Redis LREM pipeline, without affecting other tasks.
  • Celery Settings moved into its own panel section below the status panel.
  • Four superuser-only REST API endpoints (same permission guards as SystemSettingsViewSet):
    • GET /api/v2/celery/status/ — worker status, broker status, queue length, and task config
    • POST /api/v2/celery/queue/purge/ — purge all pending tasks from the broker queue
    • GET /api/v2/celery/queue/details/ — per-task breakdown (O(N) queue scan)
    • POST /api/v2/celery/queue/task/purge/ — purge all queued tasks matching a given task name
  • Fix get_celery_worker_status() to use app.control.ping() via Celery's pidbox control channel instead of dispatching a task. The old approach would hang indefinitely when the task queue was clogged.
  • New environment variables in settings.dist.py:
    • DD_CELERY_TASK_TIME_LIMIT (default: 43200 = 12 h) — hard kill via SIGKILL after this many seconds
    • DD_CELERY_TASK_SOFT_TIME_LIMIT (default: disabled) — raises SoftTimeLimitExceeded for graceful cleanup
    • DD_CELERY_TASK_DEFAULT_EXPIRES (default: 43200 = 12 h) — silently discard tasks that have been waiting in the queue longer than this before any worker picks them up
  • Client-side AJAX rendering of the status panel so that dojo-pro can consume the same API endpoints without duplicating the view logic.
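The per-task breakdown relies on how Celery stores pending work on a Redis broker: with the default JSON serialization (message protocol v2), each entry in the queue list is a JSON envelope whose `headers` dict carries the task name. A minimal sketch of that counting step, assuming protocol-v2 envelopes (function names are hypothetical, not the PR's exact code):

```python
import json
from collections import Counter


def task_name_of(raw_message):
    """Extract the Celery task name from one raw broker queue entry.

    Assumes Celery's default JSON serialization (message protocol v2),
    where each Redis list entry is a JSON envelope whose 'headers'
    dict carries the task name.
    """
    envelope = json.loads(raw_message)
    return envelope.get("headers", {}).get("task", "<unknown>")


def count_queued_tasks(raw_messages):
    """Per-task-name counts, like the details table's count column."""
    return Counter(task_name_of(m) for m in raw_messages)
```

Counting this way requires reading every entry, which is why the details endpoint is an O(N) scan behind an explicit button rather than part of the default status call.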

@github-actions github-actions bot added the settings_changes, apiv2, and ui labels Feb 19, 2026
@valentijnscholten valentijnscholten added this to the 2.56.0 milestone Feb 19, 2026

@Maffooch Maffooch left a comment


This is a great start! Would it be possible to iterate over the task names, and get a count of those currently queued? That would allow for easier debugging into where bottlenecks are

- Add dedicated System Status page (/system_status) with superuser-only
  access, accessible from the navigation menu alongside System Settings
- Display Celery worker liveness, pending queue length with human-readable
  duration formatting, and active task timeout/expiry configuration
- Add Purge queue button that POSTs to the new API endpoint and reloads
  the page on success
- Fix get_celery_worker_status() to use app.control.ping() via the
  pidbox control channel, which works correctly even when the task queue
  is clogged (previously dispatched a task that would never be picked up)
- Add purge_celery_queue() utility using a direct broker connection
- Add two new superuser-only REST API endpoints:
    GET  /api/v2/celery/status/       - worker status, queue length, config
    POST /api/v2/celery/queue/purge/  - purge all pending tasks
  Both use the same permission guards as SystemSettingsViewSet
  (IsSuperUser + DjangoModelPermissions against System_Settings)
- Add DD_CELERY_TASK_TIME_LIMIT (default 12h), DD_CELERY_TASK_SOFT_TIME_LIMIT
  (default disabled), and DD_CELERY_TASK_DEFAULT_EXPIRES (default 12h)
  environment variables to settings.dist.py with explanatory comments
- Move celery status rendering from server-side Django view to client-side
  AJAX so dojo-pro can consume the same API endpoints
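The ping-based fix works because `app.control.ping()` talks to workers over the pidbox control channel, which is separate from the task queue, so a clogged queue doesn't block the reply. A hedged sketch of how the replies could be interpreted (the helper name is hypothetical; only the `app.control.ping()` call itself is from the PR):

```python
def workers_alive(ping_replies):
    """Interpret the reply list from Celery's app.control.ping().

    ping() returns a list like [{'celery@worker1': {'ok': 'pong'}}],
    one dict per responding worker; an empty list means no worker
    replied within the timeout.
    """
    return any(
        "ok" in status
        for reply in ping_replies
        for status in reply.values()
    )


# In the view, this would be driven by something like (not executed here):
#   replies = app.control.ping(timeout=2.0)
#   worker_up = workers_alive(replies)
```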

feat: add Refresh button next to Purge button on System Status page

remove plan
@valentijnscholten valentijnscholten force-pushed the feature/celery-queue-status-ui branch from c29ee67 to 73f690b on February 26, 2026 08:04
@valentijnscholten

This is a great start! Would it be possible to iterate over the task names, and get a count of those currently queued? That would allow for easier debugging into where bottlenecks are

Redis doesn't provide a lightweight way to do this. The implementation would have to iterate over all entries in the queue, extract each task name, and count them. I didn't add that because it would probably be slow on clogged instances. But I could run a test to see if it's acceptable and put it behind a "Statistics" button so it's only triggered when the user needs the extra level of detail.
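The full iteration can at least be done in bounded round trips by walking the list with LRANGE in fixed-size batches. A minimal sketch with the Redis call abstracted as a callable (name and shape are illustrative, not the PR's code):

```python
def scan_list_in_batches(lrange, batch_size=1000):
    """Yield every entry of a Redis list, batch_size items per round trip.

    `lrange(start, stop)` is assumed to behave like redis-py's
    r.lrange(queue_name, start, stop): inclusive indexes, and an
    empty result once start is past the end of the list.
    """
    start = 0
    while True:
        batch = lrange(start, start + batch_size - 1)
        if not batch:
            return
        yield from batch
        if len(batch) < batch_size:
            # Short batch means we reached the end of the list.
            return
        start += batch_size
```

This keeps memory per round trip bounded by `batch_size`, while the total cost stays O(N) in the queue length, which is why it belongs behind an explicit button.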

@Maffooch Maffooch modified the milestones: 2.56.0, 2.57.0 Feb 27, 2026
- Split status display into separate Redis Broker and Celery Worker badges
- Add loading spinners on page load and refresh for all status indicators
- Add per-task queue breakdown table behind a 'View Details' button (O(N)
  scan with warning); shows task name, count, oldest/newest queue position,
  and expiry timestamps with human-readable time-left
- Add per-row purge button to remove all queued tasks by task name
- Add global queue purge and per-task purge API endpoints
- Move Celery settings table into its own panel section
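The expiry column's "human-readable time-left" indicator can be produced with a small formatter; a hypothetical sketch of such a helper (not the PR's exact code), assuming a seconds-until-expiry input where a negative value means the task has already expired:

```python
def human_time_left(seconds):
    """Render a time-left indicator like the details table's expiry column.

    Negative input means the task's expiry timestamp is in the past.
    """
    if seconds < 0:
        return "expired"
    hours, rem = divmod(int(seconds), 3600)
    minutes, _ = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes}m left"
    return f"{minutes}m left"
```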
…figurable settings

- Add INFO log on start/finish and DEBUG log per batch in purge_celery_queue_by_task_name
- Cap per-task purge at 10,000 tasks with a WARNING when the cap is hit
- Make batch size and max tasks configurable via DD_CELERY_QUEUE_PURGE_BATCH_SIZE and DD_CELERY_QUEUE_PURGE_MAX_TASKS
- After per-task purge, refresh both status and details table instead of removing the row
- Refresh details table when clicking the Refresh button while details are visible
- Add note below details table referencing the two env variables
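The per-task purge described above can be sketched as a snapshot-then-LREM approach: read the queue once, collect the payloads whose envelope headers match the task name, then remove them in batched pipelines so each round trip stays small. This is a hypothetical illustration under those assumptions (function and parameter names are mine, not the PR's; `client` is assumed to be a redis-py client):

```python
import json
import logging

logger = logging.getLogger(__name__)


def purge_queue_by_task_name(client, queue, task_name,
                             batch_size=1000, max_tasks=10_000):
    """Remove all queued entries for one task name via batched LREM.

    Each LREM with count=1 removes the first occurrence of that exact
    payload, leaving entries for other task names untouched.
    """
    snapshot = client.lrange(queue, 0, -1)  # O(N) read of the queue
    matches = [
        raw for raw in snapshot
        if json.loads(raw).get("headers", {}).get("task") == task_name
    ]
    if len(matches) > max_tasks:
        logger.warning("purge capped at %d of %d matching tasks",
                       max_tasks, len(matches))
        matches = matches[:max_tasks]
    logger.info("purging %d '%s' tasks from %s",
                len(matches), task_name, queue)
    for i in range(0, len(matches), batch_size):
        pipe = client.pipeline()
        for raw in matches[i:i + batch_size]:
            pipe.lrem(queue, 1, raw)
        # One round trip per batch keeps the server's query buffer bounded.
        pipe.execute()
    return len(matches)
```

Batching the pipeline is what avoids the max-query-buffer problem mentioned later in the thread: a single pipeline holding 100k LREM commands would be one enormous request.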
@valentijnscholten

valentijnscholten commented Mar 28, 2026


@Maffooch Added a View Details button which renders a table with the number of tasks per task name, some extra info, and a per-task-name purge button. Getting the details table is not very slow: a couple of seconds even with 100k tasks queued.

…d purge

- Replaced Kombu channel.client with redis.from_url() in both
  get_celery_queue_details() and purge_celery_queue_by_task_name()
  so both functions use the same connection mechanism
- Batched pipeline approach for per-task purge to avoid hitting
  Valkey's max query buffer limit on large queues
@Maffooch Maffooch merged commit 1443f31 into DefectDojo:dev Apr 3, 2026
157 checks passed

Labels

apiv2, settings_changes, ui, unittests


5 participants