Appendix D: Maintenance & Automation Schedule
This appendix is the single source of truth for all scheduled maintenance: background workers, data retention, billing reconciliation, and GitHub Actions. Constants live in backend/utils/retention.py.
Continuous Background Workers
| Service | Command | Cadence | Responsibility |
|---|---|---|---|
| Outbox Relay | /opt/venv/bin/python relay_start.py | Every 5s | Publishes OutboxEvent rows to Redpanda |
| Telemetry Worker | python manage.py telemetry_worker | Continuous | Kafka stream consumption |
| Telemetry Worker | ↑ | Every 30s | Active service pinger (pingers.py) |
| Telemetry Worker | ↑ | Every 1h | aggregate_analytics rollups |
| Telemetry Worker | ↑ | Every 6h | Lighthouse quality scans |
| ML Worker | python manage.py ml_worker | Continuous | Kafka ml-training-events consumer |
| ML Worker | ↑ | Every 24h | train_all_models (SLA, Threat, CES models) |
| Security Worker | python manage.py security_worker | Every 1h | fetch_threat_intel |
| Security Worker | ↑ | Every 24h (staggered) | db_cleanup, scan_dark_web, sync_subscriptions, VACUUM ANALYZE |
Daily security jobs are staggered one hour apart (db_cleanup for compliance at T+0h, scan_dark_web at T+1h, sync_subscriptions for billing at T+2h, VACUUM ANALYZE at T+3h) to avoid thundering-herd load on Postgres and Stripe.
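The stagger described above can be sketched as a small offset table. This is an illustration only, not the actual security_worker code: the DAILY_JOB_OFFSETS mapping, the helper names, and the assumption that "compliance" corresponds to db_cleanup are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Offsets follow the T+0h..T+3h stagger in this appendix. The dict itself and
# the pairing of "compliance" with db_cleanup are assumptions for illustration.
DAILY_JOB_OFFSETS = {
    "db_cleanup": timedelta(hours=0),
    "scan_dark_web": timedelta(hours=1),
    "sync_subscriptions": timedelta(hours=2),
    "vacuum_analyze": timedelta(hours=3),
}


def daily_run_times(cycle_start: datetime) -> dict[str, datetime]:
    """Return the planned start time of each daily job for one 24h cycle."""
    return {name: cycle_start + offset for name, offset in DAILY_JOB_OFFSETS.items()}


def due_jobs(cycle_start: datetime, now: datetime, already_run: set[str]) -> list[str]:
    """List jobs whose staggered start time has passed and that have not run yet."""
    schedule = daily_run_times(cycle_start)
    return [
        name
        for name, run_at in schedule.items()
        if run_at <= now and name not in already_run
    ]


if __name__ == "__main__":
    start = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # Two hours and five minutes into the cycle, with db_cleanup already done.
    print(due_jobs(start, start + timedelta(hours=2, minutes=5), {"db_cleanup"}))
```

Keeping the offsets in one mapping makes the ordering easy to audit against the stagger stated above.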
Data Retention (db_cleanup)
db_cleanup is owned exclusively by security_worker (not train_all_models). Policy constant: RAW_TELEMETRY_RETENTION_DAYS = 30.
| Data Class | Retention | Action |
|---|---|---|
| Endpoints (raw ping telemetry) | 30 days | Deleted |
| AuditLog | 30 days | Deleted |
| CookieConsent | 30 days | Deleted |
| OutboxEvent (published) | 30 days | Deleted |
| OutboxEvent (DLQ, ≥5 failed attempts) | 7 days | Deleted |
| ThreatIntelligence | Indefinite | Legacy duplicates removed only |
| BugReport, ThreatReport, TrainingRun | Indefinite | Kept as system of record |
| ClickHouse OLAP spans | 30 days | TTL in OTEL collector config |
| GCS object storage | 30 days | Terraform lifecycle rule |
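As a rough illustration of how the OutboxEvent rows in the table above translate into a cutoff check, the sketch below computes retention cutoffs in plain Python. Only RAW_TELEMETRY_RETENTION_DAYS is named in this appendix; the other constant names, the use of a created_at timestamp, and the rule that rows matching neither class are left alone are assumptions.

```python
from datetime import datetime, timedelta, timezone

RAW_TELEMETRY_RETENTION_DAYS = 30  # named in this appendix

# Hypothetical constant names; the values come from the retention table.
PUBLISHED_OUTBOX_RETENTION_DAYS = 30
DLQ_RETENTION_DAYS = 7
DLQ_MIN_FAILED_ATTEMPTS = 5


def retention_cutoff(now: datetime, days: int) -> datetime:
    """Rows timestamped before the returned value are eligible for deletion."""
    return now - timedelta(days=days)


def outbox_event_expired(
    created_at: datetime, published: bool, failed_attempts: int, now: datetime
) -> bool:
    """Apply the two OutboxEvent retention classes from the table."""
    if published:
        return created_at < retention_cutoff(now, PUBLISHED_OUTBOX_RETENTION_DAYS)
    if failed_attempts >= DLQ_MIN_FAILED_ATTEMPTS:
        return created_at < retention_cutoff(now, DLQ_RETENTION_DAYS)
    # Assumption: unpublished rows below the DLQ threshold are never purged here.
    return False


if __name__ == "__main__":
    now = datetime(2024, 3, 1, tzinfo=timezone.utc)
    print(retention_cutoff(now, RAW_TELEMETRY_RETENTION_DAYS))
    print(outbox_event_expired(now - timedelta(days=8), False, 5, now))
```

In a Django worker the same cutoff would typically feed a queryset filter followed by a delete; the exact models and field names are not specified here.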
Billing & Account Lifecycle
| Mechanism | Type | Schedule | Details |
|---|---|---|---|
| sync_subscriptions | Scheduled | Daily (security_worker) | Stripe sweep: upgrades active subs, downgrades lapsed Pro users; preserves manual Pro grants (tier=Pro, no stripe_customer_id) |
| Stripe webhooks | Real-time | On event | checkout.session.completed, customer.subscription.updated/deleted |
| POST /api/v1/billing/sync | On-demand | User-initiated | Manual subscription reconciliation |
| DELETE /api/v1/auth/delete-account | On-demand | User-initiated | Django CASCADE deletes tenant data; Firebase identity removed client-side |
There is no scheduled purge of dormant accounts or orphaned Stripe customers.
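The sync_subscriptions rule in the table above (upgrade active subscriptions, downgrade lapsed Pro users, preserve manual Pro grants) can be sketched as a pure decision function. This is a minimal sketch, not the real sweep: the function name, the "Free" tier label, and the assumption that Pro is the only paid tier are hypothetical, and the Stripe lookup is reduced to a boolean.

```python
from typing import Optional

PRO = "Pro"
FREE = "Free"  # hypothetical label for the non-Pro tier


def reconcile_tier(
    current_tier: str,
    stripe_customer_id: Optional[str],
    has_active_subscription: bool,
) -> str:
    """Return the tier a user should hold after the daily sweep."""
    # Manual Pro grant: tier=Pro with no stripe_customer_id is preserved.
    if current_tier == PRO and not stripe_customer_id:
        return PRO
    # An active Stripe subscription upgrades (or keeps) the user on Pro.
    if has_active_subscription:
        return PRO
    # A Stripe-linked Pro user without an active subscription has lapsed.
    if current_tier == PRO:
        return FREE
    return current_tier


if __name__ == "__main__":
    print(reconcile_tier("Pro", None, False))       # manual grant is kept
    print(reconcile_tier("Pro", "cus_123", False))  # lapsed subscriber
```

Checking the manual-grant case first means a missing stripe_customer_id can never cause a downgrade, which matches the exception stated in the table.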
GitHub Actions (Repository Maintenance)
| Workflow | Cadence | Purpose |
|---|---|---|
| renovate.yml | Weekly (Sun 00:00 UTC) | Dependency update PRs |
| 30-60-90-automation.yml | Monthly (1st) | Semgrep SAST, npm audit, uv lock |
| 30-60-90-automation.yml | Quarterly (Jan/Apr/Jul/Oct 1st) | Frontend build audit, ruff check |
| ci-tests.yml | Push/PR to main | Backend pytest + frontend vitest |
| huggingface-space.yml | Push to main | HF Space model artifact sync |
| purge-cloudflare-cache.yml | On deploy | CDN cache invalidation |
| firebase-backend-deploy.yml | Push to main | Cloud Functions + Firestore rules |
| firebase-hosting-*.yml | Push/PR to main | Marketing site deploy |
[!NOTE] Daily ML training and hourly threat-intel fetches run in Django workers, not GitHub Actions. There is no daily-automation.yml workflow.
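The two cadences listed for 30-60-90-automation.yml overlap on the first day of each quarter. The sketch below shows which job sets would fire on a given date. It assumes both cadences key off the calendar date alone and that the monthly set also runs on quarter-start days; the function and label names are hypothetical.

```python
from datetime import date

QUARTER_START_MONTHS = {1, 4, 7, 10}  # Jan/Apr/Jul/Oct, per the workflow table


def job_sets_for(day: date) -> list[str]:
    """Return the 30-60-90 job sets expected to run on the given date."""
    if day.day != 1:
        return []
    sets = ["monthly"]  # Semgrep SAST, npm audit, uv lock
    if day.month in QUARTER_START_MONTHS:
        sets.append("quarterly")  # frontend build audit, ruff check
    return sets


if __name__ == "__main__":
    for d in (date(2024, 2, 1), date(2024, 4, 1), date(2024, 4, 2)):
        print(d.isoformat(), job_sets_for(d))
```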