Appendix H: Background Schedulers & Asynchronous Workflows
The DEML Platform orchestrates several asynchronous background workers. These workers run continuously to process Redpanda events, trigger periodic machine learning pipelines, and enforce strict DevSecOps compliance. See Appendix D for the consolidated schedule table.
1. Telemetry Worker (telemetry_worker.py)
- Stream Processing (Continuous): Consumes Redpanda topics (app-events, user-issues) and projects events into Postgres + Firestore.
- Data Aggregation (1 Hour): Triggers aggregate_analytics every 3,600 seconds to roll up raw OTLP traces into historical charts.
- Active Pinger (30 Seconds): Pings all monitored services every 30 seconds for real-time uptime metrics.
- Quality Scanner (6 Hours): Runs Google PageSpeed (Lighthouse) audits on tenant target URLs every 21,600 seconds.
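The cadences listed above can be sketched as a small interval scheduler. This is a minimal illustration under stated assumptions, not the code in telemetry_worker.py: the PeriodicTask class, the due_tasks helper, and the task labels active_pinger and quality_scanner are hypothetical names. Only aggregate_analytics and the interval values come from the list above.

```python
from dataclasses import dataclass
from typing import List

# Intervals in seconds, taken from the worker description above.
# "active_pinger" and "quality_scanner" are illustrative labels (assumptions).
INTERVALS = {
    "active_pinger": 30,
    "aggregate_analytics": 3_600,
    "quality_scanner": 21_600,
}


@dataclass
class PeriodicTask:
    name: str
    interval_s: float
    next_run: float = 0.0  # due on the first tick


def make_tasks() -> List[PeriodicTask]:
    """Build one task record per configured interval."""
    return [PeriodicTask(name, interval) for name, interval in INTERVALS.items()]


def due_tasks(tasks: List[PeriodicTask], now: float) -> List[str]:
    """Return the names of tasks due at `now` and reschedule each one."""
    fired = []
    for task in tasks:
        if now >= task.next_run:
            fired.append(task.name)
            task.next_run = now + task.interval_s
    return fired
```

A long-running loop would call due_tasks(tasks, time.monotonic()) on each tick, dispatch the returned names to their handlers, and sleep briefly between ticks. Passing the clock value as an argument keeps the scheduling logic easy to test without waiting for real time to elapse.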
2. Consolidated Background Workers (deml_workers_start.py)
To minimize deployment footprint, operational complexity, and resource overhead, the Machine Learning Worker and Security Worker have been consolidated into a single background service (deml-workers). This unified runner spawns dedicated threads to execute background tasks concurrently and consumes scheduled event triggers from Redpanda:
- ML Worker Thread (ml_worker command):
  - Listens on ml-training-events for per-tenant (train_tenant) and platform-wide (train_all_tenants) actions.
  - Runs daily train_all_models cycles to retrain the PyTorch SLA and Threat models.
- Security Worker Thread (security_worker command):
  - Synchronizes AbuseIPDB and AlienVault OTX threat intelligence feeds hourly (fetch_threat_intel).
  - Audits Data Encryption Key age (DEK rotation limit of 30 days) and runs compliance data purges (db_cleanup).
  - Conducts OSINT dark web scans daily.
  - Sweeps active Stripe subscriptions daily (sync_subscriptions) to enforce pricing tiers.
  - Optimizes PostgreSQL storage nightly via VACUUM ANALYZE.
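The unified-runner pattern described above, one process spawning a dedicated thread per worker, might look roughly like the following. This is a sketch only, not the contents of deml_workers_start.py: start_workers and make_stub_worker are hypothetical helpers, and the stub bodies stand in for the real ML and security loops. Only the thread names ml_worker and security_worker come from the text.

```python
import threading
from typing import Callable, Dict, List

# A worker is any callable that runs until the shared stop event is set.
WorkerFn = Callable[[threading.Event], None]


def start_workers(workers: Dict[str, WorkerFn], stop: threading.Event) -> List[threading.Thread]:
    """Start one named daemon thread per worker and return the threads."""
    threads = []
    for name, fn in workers.items():
        thread = threading.Thread(target=fn, args=(stop,), name=name, daemon=True)
        thread.start()
        threads.append(thread)
    return threads


def make_stub_worker(log: List[str], label: str) -> WorkerFn:
    """Build a placeholder worker that records its start and idles until stopped."""
    def worker(stop: threading.Event) -> None:
        log.append(label)
        stop.wait()  # a real worker would loop over its scheduled tasks here
    return worker


if __name__ == "__main__":
    started: List[str] = []
    stop_event = threading.Event()
    running = start_workers(
        {
            "ml_worker": make_stub_worker(started, "ml_worker"),
            "security_worker": make_stub_worker(started, "security_worker"),
        },
        stop_event,
    )
    stop_event.set()  # signal shutdown; each worker returns from stop.wait()
    for t in running:
        t.join(timeout=5)
```

Sharing a single threading.Event gives every thread the same shutdown signal, so the process can stop all workers together. Whether the actual service coordinates shutdown this way is an assumption.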
4. Outbox Relay (outbox_relay.py)
- Event Publishing (5 Seconds): Polls unpublished OutboxEvent rows and publishes to Redpanda. Published rows are purged by db_cleanup after 30 days; exhausted retries (≥5 attempts) are purged after 7 days.
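The retention rules above can be expressed as two small predicates. This is a hedged sketch, not the platform's db_cleanup logic: the OutboxRow fields (created_at, attempts, published_at) are hypothetical, and the source does not say which timestamp each retention window is measured from. Here published rows are aged from published_at and exhausted rows from created_at, both as assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Values from the description above.
PUBLISHED_RETENTION = timedelta(days=30)
EXHAUSTED_RETENTION = timedelta(days=7)
MAX_ATTEMPTS = 5


@dataclass
class OutboxRow:
    # Hypothetical column names; the real OutboxEvent schema is not shown.
    created_at: datetime
    attempts: int = 0
    published_at: Optional[datetime] = None


def is_pending(row: OutboxRow) -> bool:
    """A row is still eligible for relay if unpublished with retries left."""
    return row.published_at is None and row.attempts < MAX_ATTEMPTS


def should_purge(row: OutboxRow, now: datetime) -> bool:
    """Apply the 30-day (published) and 7-day (exhausted) retention windows."""
    if row.published_at is not None:
        # Assumption: age is measured from the publish time.
        return now - row.published_at >= PUBLISHED_RETENTION
    if row.attempts >= MAX_ATTEMPTS:
        # Assumption: age is measured from row creation.
        return now - row.created_at >= EXHAUSTED_RETENTION
    return False  # pending rows are never purged
```

Under these assumptions, a relay pass would select rows where is_pending is true every 5 seconds, while the cleanup task would delete rows where should_purge is true.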