DNS anomaly detection: Defend against sophisticated malware
Not so long ago, the standard way of looking for a malware infection was simply to monitor web traffic. By looking, for example, for HTTP requests to google.com/webhp – a typical Internet connectivity check – we could easily pinpoint a ZeuS-infected machine. Problem solved.
Sadly, cybercriminals use increasingly sophisticated methods of communication such as Domain Generation Algorithms (DGA) designed to evade detection in the growing noise of web traffic and to prevent the takedown of a botnet. DGAs are algorithms used by malware that generate domain names, which then serve as rendezvous points with their controllers. They are used as a method to restore communication when a controller is offline.
As cybercriminals change and improve their evasion techniques, monitoring capabilities also have to change and become more sophisticated. The focus in monitoring has always been on analyzing successful connections, whether it is an HTTP connection or an email. Now, we need to mine DNS traffic data to detect threats and pinpoint their sources. DNS monitoring takes us much further, providing information on failing attempts – the red flags of suspicious activity.
The good news is that since DNS is an essential component of the Internet, there is no way cybercriminals can get around it. Most activities that they engage in online will create DNS traffic. Most importantly, since their uses of DNS are atypical, this becomes a weakness that can be used against them.
Capturing and creating usable blocks of data
DNS traffic is rich in information. When captured correctly, it tells us what domain a computer attempts to connect with. In a typical situation, someone requests a specific domain name and it translates to an IP address. A successful request will create HTTP traffic towards that domain. But if a domain is entered incorrectly, the request will fail, generating an NXDOMAIN response.
Malicious DNS traffic does not follow this typical sequence. An infected machine will generate hundreds of requests for different domains in a short burst, attempting to reach its command and control (C&C) server by guessing which domain the cybercriminals currently control. In effect, the malware works through a predetermined list of candidate domains until it reaches the active one. Most of these lookups fail, which produces a great deal of detectable noise: high volumes of NXDOMAIN responses are red flags for malware threats.
To avoid sending up these red flags, malicious software communicates with new domains intermittently to frustrate detection efforts. The randomness of the timing defeats static timing analysis of traffic. This “agile” DNS method also evades blacklists, which are historical records of domains that have been used maliciously in the past.
With every Internet transaction creating DNS traffic, monitoring is obviously not a small task. Normal DNS traffic typically generates about 12 NXDOMAIN responses per hour. At one client, we were able to detect and resolve an infection almost instantly when our DNS monitoring uncovered 400 NXDOMAIN responses per hour.
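The rate check described above can be sketched in a few lines of Python. This is only an illustration, assuming the NXDOMAIN responses have already been parsed into (timestamp, client IP) pairs. The function name flag_noisy_clients and the default threshold of 100 per hour are hypothetical and should be tuned against your own baseline.

```python
from collections import Counter
from datetime import datetime

def flag_noisy_clients(events, threshold=100):
    """Return {(client, hour): count} for clients whose NXDOMAIN count
    within one clock hour exceeds the threshold.

    events: iterable of (datetime, client_ip) pairs, one per NXDOMAIN response.
    threshold: hypothetical cut-off, not a value taken from any tool.
    """
    counts = Counter()
    for ts, client in events:
        # Truncate the timestamp to the start of its clock hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        counts[(client, hour)] += 1
    return {key: n for key, n in counts.items() if n > threshold}
```

A client producing around 12 failures in an hour stays below the threshold, while one producing 400 is reported together with the hour in which the burst occurred.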
It is essential to use a sophisticated and comprehensive system to collect the DNS traffic captured by monitoring sensors. PassiveDNS aggregates duplicate traffic, keeping the logs small without losing the volume information. Most importantly, it keeps track of requests and responses and splits the NXDOMAIN responses, which are essential to DGA detection, into a separate log. This dramatically reduces the amount of traffic to be analyzed and lets us focus on the 10% of the traffic that fails.
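To make the idea of aggregation concrete, here is a minimal sketch of collapsing duplicate records into counts while routing failures to their own bucket. It is not PassiveDNS's implementation or log format. The input tuple layout and the function name aggregate are assumptions made for illustration.

```python
from collections import Counter

def aggregate(records):
    """Collapse duplicate DNS records into counts and split out failures.

    records: iterable of (client_ip, query_name, rcode) tuples, where rcode
    is the response code as a string such as "NOERROR" or "NXDOMAIN"
    (an assumed input format).
    Returns (answered, failed): two Counters keyed by (client_ip, query_name).
    """
    answered, failed = Counter(), Counter()
    for client, name, rcode in records:
        # Normalise case and the optional trailing dot so duplicates merge.
        key = (client, name.lower().rstrip("."))
        if rcode == "NXDOMAIN":
            failed[key] += 1
        else:
            answered[key] += 1
    return answered, failed
```

Each distinct query is stored once with its count, so the volume information survives and the failed lookups can be analyzed separately.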
Finding the source of malicious DNS traffic
While monitoring will detect a malware infection, an analysis of the data will lead to the source, and finding the infected host is always our goal. There are various tools and methods for analyzing DNS traffic for DGA patterns and for searching DNS logs for specific queries from known or suspected botnets. Proven analysis tools that focus only on failed DNS requests can quickly search for malicious domains and return only a low percentage of false positives. When these tools are focused on a specific data set, the DGA domains stick out like a sore thumb.
Another method for analyzing NXDOMAIN logs is searching for long domains. Legitimate domains are typically fewer than 12 characters long, and usually as short as possible in order to be memorable. A cybercriminal may direct his bots to use longer, illegitimate domains for communication, which makes them obvious and easier to find. For example, a 14-character domain made up of only consonants will be automatically flagged as malicious by the detection system.
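The length and consonant test above might look like the following sketch. It makes several assumptions: the label checked is the one immediately left of the TLD (a naive split that ignores multi-part suffixes such as .co.uk), the letter "y" counts as a consonant, and the names is_suspicious and min_length are hypothetical.

```python
VOWELS = set("aeiou")

def second_level_label(domain):
    """Return the label just left of the TLD (naive split on dots)."""
    parts = domain.lower().rstrip(".").split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]

def is_suspicious(domain, min_length=12):
    """Flag long labels made up only of ASCII consonants."""
    label = second_level_label(domain)
    letters_only = label.isascii() and label.isalpha()
    consonants_only = letters_only and not (set(label) & VOWELS)
    return len(label) >= min_length and consonants_only
```

A 14-character consonant string such as "qwrtzpsdfghjkl.com" is flagged, while short or vowel-bearing names such as "google.com" are not.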
The best-known and most widespread malware is ZeuS; this malware family has infected millions of PCs. The typical ZeuS query is 33 characters long or more and ends with a .ru, .com, .biz, .info, .org or .net domain extension.
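As a rough illustration, the ZeuS pattern described above can be expressed as a coarse pre-filter. This sketch assumes the 33-character length refers to the full query name including the extension, which the text does not state explicitly. The function name looks_like_zeus_query is hypothetical. A filter this simple will also match long legitimate names, so it is a starting point for review rather than a signature.

```python
import re

# Domain extensions listed in the text for typical ZeuS queries.
ZEUS_TLDS = ("ru", "com", "biz", "info", "org", "net")
_TLD_RE = re.compile(r"\.(?:%s)$" % "|".join(ZEUS_TLDS))

def looks_like_zeus_query(name, min_length=33):
    """Return True if the query name is long enough and ends in a listed TLD.

    Assumption: length is measured on the whole name, extension included.
    """
    name = name.lower().rstrip(".")
    return len(name) >= min_length and _TLD_RE.search(name) is not None
```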
In addition to analysis tools, there are specific methods that can be used to search through the NXDOMAIN logs. There are three domain characteristics that we look for:
Domain length – broken into 6 different length categories.
Character makeup – alphanumeric, letters only and consonants only.
Top Level Domains (TLD) – 272 variations.
This method constantly looks for any combination of these three characteristics – a bit like a slot machine rotating its reels waiting to hit the jackpot.
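The three-characteristic search can be sketched as a fingerprint lookup. The boundaries of the six length categories are not given in the text, so the LENGTH_EDGES values below are purely hypothetical, as are the names fingerprint and matches. The label is again taken naively as the part immediately left of the TLD.

```python
from bisect import bisect_right

# Hypothetical boundaries: five edges produce six length categories (0 to 5).
LENGTH_EDGES = (8, 12, 16, 24, 32)
VOWELS = set("aeiou")

def makeup(label):
    """Classify a label as consonants, letters, alphanumeric or other."""
    if label.isascii() and label.isalpha():
        return "letters" if set(label) & VOWELS else "consonants"
    if label.isascii() and label.isalnum():
        return "alphanumeric"
    return "other"  # hyphens, empty labels, non-ASCII, and so on

def fingerprint(domain):
    """Return (length category, character makeup, TLD) for a domain."""
    parts = domain.lower().rstrip(".").split(".")
    label = parts[-2] if len(parts) >= 2 else parts[0]
    tld = parts[-1] if len(parts) >= 2 else ""
    return (bisect_right(LENGTH_EDGES, len(label)), makeup(label), tld)

def matches(domain, profiles):
    """True if the domain's fingerprint is in a set of watched combinations."""
    return fingerprint(domain) in profiles
```

With a watch list such as {(2, "consonants", "biz")}, every failed lookup in the NXDOMAIN log can be checked against the known combinations in a single set membership test.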
Random DGA domains: