Maintenance & Automation Schedule

Appendix D: Maintenance & Automation Schedule

This appendix is the single source of truth for all scheduled maintenance: background workers, data retention, billing reconciliation, and GitHub Actions. Constants live in backend/utils/retention.py.

Continuous Background Workers

| Service | Command | Cadence | Responsibility |
|---|---|---|---|
| Outbox Relay | /opt/venv/bin/python relay_start.py | Every 5s | Publishes OutboxEvent rows to Redpanda |
| Telemetry Worker | python manage.py telemetry_worker | Continuous | Kafka stream consumption |
| Telemetry Worker | | Every 30s | Active service pinger (pingers.py) |
| Telemetry Worker | | Every 1h | aggregate_analytics rollups |
| Telemetry Worker | | Every 6h | Lighthouse quality scans |
| ML Worker | python manage.py ml_worker | Continuous | Kafka ml-training-events consumer |
| ML Worker | | Every 24h | train_all_models (SLA, Threat, CES models) |
| Security Worker | python manage.py security_worker | Every 1h | fetch_threat_intel |
| Security Worker | | Every 24h (staggered) | db_cleanup, scan_dark_web, sync_subscriptions, VACUUM ANALYZE |

Daily security jobs are staggered (compliance at T+0h, dark-web at T+1h, billing at T+2h, vacuum at T+3h) to avoid thundering-herd load on Postgres and Stripe.
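The staggering described above can be sketched as a small offset table. This is only an illustration, not the worker's actual code: the mapping of the labels (compliance, dark-web, billing, vacuum) to job names is an assumption drawn from the table, and `STAGGER_HOURS`, `daily_run_times`, and `due_jobs` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Hours after the daily anchor T at which each job starts.
# The offsets come from this appendix; the label-to-job mapping is assumed.
STAGGER_HOURS = {
    "db_cleanup": 0,          # "compliance"
    "scan_dark_web": 1,       # "dark-web"
    "sync_subscriptions": 2,  # "billing"
    "vacuum_analyze": 3,      # "vacuum"
}


def daily_run_times(anchor: datetime) -> dict[str, datetime]:
    """Return the start time of each daily job for the cycle beginning at anchor."""
    return {job: anchor + timedelta(hours=h) for job, h in STAGGER_HOURS.items()}


def due_jobs(anchor: datetime, now: datetime, already_run: set[str]) -> list[str]:
    """Return jobs whose start time has passed and that have not run this cycle."""
    times = daily_run_times(anchor)
    return [job for job, start in times.items()
            if start <= now and job not in already_run]


if __name__ == "__main__":
    t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(due_jobs(t0, t0 + timedelta(hours=1, minutes=30), set()))
```

Because each job starts a full hour after the previous one, at most one daily job is beginning at any moment, which is the point of the stagger.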

Data Retention (db_cleanup)

db_cleanup is owned exclusively by security_worker, not by train_all_models. Policy constants: RAW_TELEMETRY_RETENTION_DAYS = 30.

| Data Class | Retention | Action |
|---|---|---|
| Endpoints (raw ping telemetry) | 30 days | Deleted |
| AuditLog | 30 days | Deleted |
| CookieConsent | 30 days | Deleted |
| OutboxEvent (published) | 30 days | Deleted |
| OutboxEvent (DLQ, ≥5 failed attempts) | 7 days | Deleted |
| ThreatIntelligence | Indefinite | Legacy duplicates removed only |
| BugReport, ThreatReport, TrainingRun | Indefinite | Kept as system of record |
| ClickHouse OLAP spans | 30 days | TTL in OTEL collector config |
| GCS object storage | 30 days | Terraform lifecycle rule |
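As a rough illustration of how the OutboxEvent rows in the table translate into a cutoff check, the sketch below applies the 30-day and 7-day windows. Only `RAW_TELEMETRY_RETENTION_DAYS` is named in this appendix; the other constants and the function are hypothetical, and measuring age from the row's creation time is an assumption.

```python
from datetime import datetime, timedelta, timezone

RAW_TELEMETRY_RETENTION_DAYS = 30      # named in this appendix
PUBLISHED_OUTBOX_RETENTION_DAYS = 30   # hypothetical name; value from the table
DLQ_RETENTION_DAYS = 7                 # hypothetical name; value from the table
DLQ_MIN_FAILED_ATTEMPTS = 5            # hypothetical name; value from the table


def outbox_event_expired(now: datetime, created_at: datetime,
                         published: bool, failed_attempts: int) -> bool:
    """Decide whether an OutboxEvent row is past its retention window.

    Assumes age is measured from created_at; the real job may use another field.
    """
    age = now - created_at
    if published:
        return age > timedelta(days=PUBLISHED_OUTBOX_RETENTION_DAYS)
    if failed_attempts >= DLQ_MIN_FAILED_ATTEMPTS:
        return age > timedelta(days=DLQ_RETENTION_DAYS)
    # Unpublished rows that are still being retried are never deleted here.
    return False


if __name__ == "__main__":
    now = datetime(2024, 3, 1, tzinfo=timezone.utc)
    print(outbox_event_expired(now, now - timedelta(days=8), False, 5))
```

Dead-lettered rows use the shorter window, so a row that has failed five times is removed after a week, while successfully published rows are kept for the full 30 days.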

Billing & Account Lifecycle

| Mechanism | Type | Schedule | Details |
|---|---|---|---|
| sync_subscriptions | Scheduled | Daily (security_worker) | Stripe sweep: upgrades active subs, downgrades lapsed Pro users; preserves manual Pro grants (tier=Pro, no stripe_customer_id) |
| Stripe webhooks | Real-time | On event | checkout.session.completed, customer.subscription.updated/deleted |
| POST /api/v1/billing/sync | On-demand | User-initiated | Manual subscription reconciliation |
| DELETE /api/v1/auth/delete-account | On-demand | User-initiated | Django CASCADE deletes tenant data; Firebase identity removed client-side |

There is no scheduled purge of dormant accounts or orphaned Stripe customers.
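The sync_subscriptions rule above (upgrade active, downgrade lapsed, leave manual grants alone) can be expressed as a small pure function. This is a sketch under stated assumptions rather than the actual sweep: `Account` and `reconcile_tier` are hypothetical, the "Free" tier name is assumed (the appendix only names Pro), and the Stripe lookup is reduced to a boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    tier: str                          # "Pro", or the assumed "Free"
    stripe_customer_id: Optional[str]  # None when no Stripe customer exists


def reconcile_tier(account: Account, has_active_subscription: bool) -> str:
    """Return the tier the account should have after the daily sweep."""
    # No Stripe customer: nothing to reconcile, so a manual Pro grant
    # (tier=Pro, no stripe_customer_id) is preserved as-is.
    if account.stripe_customer_id is None:
        return account.tier
    # Stripe-backed accounts follow the subscription state.
    return "Pro" if has_active_subscription else "Free"


if __name__ == "__main__":
    print(reconcile_tier(Account("Pro", None), has_active_subscription=False))
    print(reconcile_tier(Account("Pro", "cus_example"), has_active_subscription=False))
```

Keying the exemption on the missing stripe_customer_id means a manually granted Pro account is never downgraded simply because Stripe has no record of it.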

GitHub Actions (Repository Maintenance)

| Workflow | Cadence | Purpose |
|---|---|---|
| renovate.yml | Weekly (Sun 00:00 UTC) | Dependency update PRs |
| 30-60-90-automation.yml | Monthly (1st) | Semgrep SAST, npm audit, uv lock |
| 30-60-90-automation.yml | Quarterly (Jan/Apr/Jul/Oct 1st) | Frontend build audit, ruff check |
| ci-tests.yml | Push/PR to main | Backend pytest + frontend vitest |
| huggingface-space.yml | Push to main | HF Space model artifact sync |
| purge-cloudflare-cache.yml | On deploy | CDN cache invalidation |
| firebase-backend-deploy.yml | Push to main | Cloud Functions + Firestore rules |
| firebase-hosting-*.yml | Push/PR to main | Marketing site deploy |
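To make the two cadences of 30-60-90-automation.yml concrete, the sketch below reports which tiers would fire on a given date. It models only the calendar rule from the table; how the workflow itself selects jobs is not described here, and `automation_tiers_due` is a hypothetical helper.

```python
from datetime import date

# Months in which the quarterly tier runs, per the table (Jan/Apr/Jul/Oct).
QUARTERLY_MONTHS = {1, 4, 7, 10}


def automation_tiers_due(day: date) -> list[str]:
    """Return the 30-60-90 tiers scheduled for the given calendar date."""
    tiers: list[str] = []
    if day.day == 1:
        tiers.append("monthly")        # Semgrep SAST, npm audit, uv lock
        if day.month in QUARTERLY_MONTHS:
            tiers.append("quarterly")  # Frontend build audit, ruff check
    return tiers


if __name__ == "__main__":
    print(automation_tiers_due(date(2024, 4, 1)))
```

On the first day of a quarter both tiers are due, so the monthly and quarterly checks coincide four times a year.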

> [!NOTE]
> Daily ML training and hourly threat-intel fetch run in Django workers, not GitHub Actions. There is no daily-automation.yml workflow.