Chapter 20: Network Traffic Enrichment and Cybersecurity Telemetry
In the modern threat landscape, logging raw IP addresses and standard HTTP metadata is not enough to build a robust, security-aware platform. These opaque identifiers have to be transformed into actionable intelligence. To do this, I built a dedicated telemetry enrichment layer that intercepts all general traffic (the Endpoints model) and processes it through a series of specialized tools before it reaches my database.
First, I use regular expressions to parse incoming User-Agent strings, classifying the device_type (Mobile, Desktop, Tablet, Bot), os_name, and browser_name. This lets me filter self-identified bot and crawler traffic out of my core SLA metrics, so that my latency distributions reflect real human experience. Because the User-Agent header can be spoofed, this filter only catches automated clients that identify themselves.
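The classification step can be sketched as follows. This is a minimal illustration, not the platform's actual parser: the regular expressions, the UAInfo type, and the parse_user_agent function are assumptions made for the example, and real User-Agent strings vary far more than these few rules cover.

```python
import re
from typing import NamedTuple


class UAInfo(NamedTuple):
    device_type: str   # Mobile, Desktop, Tablet, or Bot
    os_name: str
    browser_name: str


# Hypothetical patterns; a production parser needs a much longer rule set.
_BOT_RE = re.compile(r"bot|crawler|spider|slurp|curl|wget|python-requests", re.I)
_TABLET_RE = re.compile(r"iPad|Tablet|Android(?!.*Mobile)", re.I)
_MOBILE_RE = re.compile(r"Mobile|iPhone|iPod|Android", re.I)

# Order matters: iOS agents contain "Mac OS X", Android agents contain "Linux".
_OS_RULES = [
    ("iOS", re.compile(r"iPhone|iPad|iPod")),
    ("Android", re.compile(r"Android")),
    ("Windows", re.compile(r"Windows NT")),
    ("macOS", re.compile(r"Mac OS X")),
    ("Linux", re.compile(r"Linux")),
]

# Order matters: Edge and Opera agents contain "Chrome", Chrome contains "Safari".
_BROWSER_RULES = [
    ("Edge", re.compile(r"Edg(?:e|A|iOS)?/")),
    ("Opera", re.compile(r"OPR/|Opera")),
    ("Firefox", re.compile(r"Firefox/|FxiOS/")),
    ("Chrome", re.compile(r"Chrome/|CriOS/")),
    ("Safari", re.compile(r"Safari/")),
]


def _first_match(rules, ua: str) -> str:
    """Return the label of the first rule whose pattern matches, else Unknown."""
    for label, pattern in rules:
        if pattern.search(ua):
            return label
    return "Unknown"


def parse_user_agent(ua: str) -> UAInfo:
    """Classify a User-Agent string into device type, OS, and browser."""
    ua = ua or ""
    if not ua or _BOT_RE.search(ua):
        device = "Bot"          # an empty User-Agent is treated as automated
    elif _TABLET_RE.search(ua):
        device = "Tablet"
    elif _MOBILE_RE.search(ua):
        device = "Mobile"
    else:
        device = "Desktop"
    return UAInfo(device, _first_match(_OS_RULES, ua), _first_match(_BROWSER_RULES, ua))
```

With a classifier like this, excluding bots from SLA metrics reduces to dropping records whose device_type is Bot before computing latency percentiles.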
At the same time, I use the third-party requests library to query the ipwho.is API for each incoming IP address. The response provides an approximate geographic location (City, Country), which lets me correlate traffic spikes with regional events. More importantly, I extract the Autonomous System Number (asn) and Internet Service Provider (isp). This topological data is highly valuable for cybersecurity: it lets my threat models distinguish residential ISPs, which are usually benign, from data center ASNs (such as AWS or DigitalOcean), which are frequently the source of volumetric attacks, scrapers, and malicious botnets.
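A lookup along these lines might look like the sketch below. The response field names (success, city, country, and a nested connection object carrying asn and isp) reflect my understanding of the ipwho.is JSON format and should be checked against its documentation. The function names, the timeout value, and the DATACENTER_ASNS set are hypothetical, and the ASN list is deliberately tiny and incomplete.

```python
from typing import Any, Dict

IPWHOIS_URL = "https://ipwho.is/{ip}"

# Illustrative, incomplete set of hosting-provider ASNs (assumption for the example).
DATACENTER_ASNS = {
    16509,  # Amazon (AWS)
    14618,  # Amazon (AWS)
    14061,  # DigitalOcean
}


def fetch_ip_payload(ip: str, timeout: float = 2.0) -> Dict[str, Any]:
    """Fetch the raw JSON payload for one IP address."""
    # Imported lazily so the parsing helper below works without the dependency.
    import requests

    resp = requests.get(IPWHOIS_URL.format(ip=ip), timeout=timeout)
    resp.raise_for_status()
    return resp.json()


def extract_enrichment(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce a lookup payload to the fields stored with each request."""
    empty = {"city": None, "country": None, "asn": None, "isp": None,
             "is_datacenter": False}
    if not payload.get("success", False):
        return empty  # failed lookups must not block request logging
    conn = payload.get("connection") or {}
    asn = conn.get("asn")
    return {
        "city": payload.get("city"),
        "country": payload.get("country"),
        "asn": asn,
        "isp": conn.get("isp"),
        "is_datacenter": asn in DATACENTER_ASNS,
    }
```

Keeping the network call separate from the parsing step makes the parsing logic testable offline and makes it easier to cache or batch lookups, since an external API call on every request would otherwise add latency and consume rate limits.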
By integrating this metadata directly into my core Endpoints model, I enable more advanced anomaly detection. Combined with my Threat Intelligence feeds, this enriched context allows the platform to identify and rate-limit suspicious behavioral patterns early, before they escalate into critical security incidents.
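One way the enriched fields could feed a rate-limiting decision is sketched below. This is an assumption-laden illustration rather than the platform's detection logic: the EnrichedRequest record, the SlidingWindowLimiter class, and all thresholds are hypothetical, and the Threat Intelligence feeds are not modeled at all.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, Optional


@dataclass(frozen=True)
class EnrichedRequest:
    """Hypothetical subset of the enriched fields stored per request."""
    ip: str
    asn: Optional[int]
    is_datacenter: bool
    device_type: str
    timestamp: float  # seconds since the epoch


class SlidingWindowLimiter:
    """Per-IP sliding window with a stricter budget for data center and bot traffic."""

    def __init__(self, window_seconds: float = 60.0,
                 residential_limit: int = 120, datacenter_limit: int = 30):
        self.window_seconds = window_seconds
        self.residential_limit = residential_limit
        self.datacenter_limit = datacenter_limit
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def should_throttle(self, req: EnrichedRequest) -> bool:
        """Record the request and report whether its IP exceeded its budget."""
        strict = req.is_datacenter or req.device_type == "Bot"
        limit = self.datacenter_limit if strict else self.residential_limit
        hits = self._hits[req.ip]
        cutoff = req.timestamp - self.window_seconds
        while hits and hits[0] <= cutoff:
            hits.popleft()  # discard hits that fell out of the window
        hits.append(req.timestamp)
        return len(hits) > limit
```

The design choice illustrated here is that enrichment changes the threshold, not the mechanism: the same window logic applies to every client, while the ASN and device classification decide how much traffic is tolerated.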
To make full use of this metadata, I built an application-level middleware that plays a role similar to Zeek. It sits at the Django edge, passively intercepting and logging all incoming HTTP request headers, source IPs, and processing latencies. The middleware uses cached domain mappings to associate incoming traffic with its target Tenant UUID, so the request path never blocks on a database lookup. The platform dogfoods this architecture: a post_migrate signal bootstraps the platform itself as Tenant0, so that all internal monitoring and background pipelines key on UUIDs rather than on hardcoded string identifiers.
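The middleware pattern described above can be sketched without importing Django, since Django middleware is simply a callable that wraps get_response. Everything named here (TenantDomainCache, TelemetryMiddleware, the sink callback, the record layout) is hypothetical. The sketch assumes the request object exposes get_host(), META, headers, path, and method as Django's HttpRequest does. Because Django constructs middleware with get_response only, a real deployment would bind the cache and sink at module level rather than pass them to the constructor.

```python
import time
import uuid
from typing import Any, Callable, Dict, Optional


class TenantDomainCache:
    """In-memory domain-to-tenant map, refreshed outside the request path."""

    def __init__(self, loader: Callable[[], Dict[str, uuid.UUID]]):
        self._loader = loader  # e.g. a function that queries the tenant table
        self._mapping: Dict[str, uuid.UUID] = {}
        self.refresh()

    def refresh(self) -> None:
        """Reload the mapping; call on startup and whenever tenants change."""
        self._mapping = {d.lower(): t for d, t in self._loader().items()}

    def resolve(self, host: str) -> Optional[uuid.UUID]:
        """Look up a tenant by host name, ignoring any port suffix."""
        hostname = host.lower()
        if not hostname.startswith("["):  # leave bracketed IPv6 literals as-is
            hostname = hostname.split(":", 1)[0]
        return self._mapping.get(hostname)


class TelemetryMiddleware:
    """Passively records headers, source IP, tenant, and latency per request."""

    def __init__(self, get_response: Callable[[Any], Any],
                 cache: TenantDomainCache,
                 sink: Callable[[Dict[str, Any]], None]):
        self.get_response = get_response
        self.cache = cache
        self.sink = sink  # e.g. a queue writer; should not block

    def __call__(self, request: Any) -> Any:
        started = time.perf_counter()
        try:
            return self.get_response(request)
        finally:
            # Runs even if the view raises, so failed requests are logged too.
            self.sink({
                "tenant_id": self.cache.resolve(request.get_host()),
                "source_ip": request.META.get("REMOTE_ADDR"),
                "method": request.method,
                "path": request.path,
                "headers": dict(request.headers),
                "latency_ms": (time.perf_counter() - started) * 1000.0,
            })
```

Resolving the tenant from a dictionary keeps the per-request cost to a hash lookup; the trade-off is that the mapping can be stale until refresh() runs, so tenant creation and domain changes need to trigger a reload.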