Traditionally, malware carried the domain names or IP addresses of its C&C server hard-coded in the binary and connected to that server directly. However, a malware analyst can easily discover these hard-coded domain names or IP addresses through reverse engineering and blacklist them. A Domain Generation Algorithm (DGA) is a technique that malware authors employ to resist takedown and blacklisting of the C&C domains. A DGA generates a large number of candidate domain names for the C&C server. Constantly changing the C&C server's domain name by means of a DGA is known as Domain-Fluxing. Notable DGA-based malware families include Zeus GameOver, Cryptolocker, PushDo, Conficker and Ramdo.
Domain Generation Algorithm (DGA):
A Domain Generation Algorithm (DGA) is a technique that the adversary embeds in the malware binary to periodically generate a large number of pseudo-random domain names, most of which do not exist, as candidates for the Command-and-Control (C&C) server. The malware then attempts to resolve these generated domain names by sending DNS queries until one of them resolves to the IP address of a C&C server. DGA-generated domain names thus function as rendezvous points for the malware and its C&C server. Domain names generated by a DGA are also known as Algorithmically Generated Domains (AGDs). A DGA is employed to prevent the C&C server from being taken down and to hinder blacklisting attempts.
Because the adversary holds both the DGA and its seed, the adversary can generate exactly the same list of domain names that the malware generates. This lets the adversary predict which domain names the infected machines will attempt to query at a given date and time, and then register in advance one of the domain names that the DGA-embedded malware is expected to generate.
DGA malware periodically generates a large number of candidate domain names for the C&C server and queries all of these algorithmically generated domains (AGDs) in order to resolve the IP address of the C&C server. The adversary registers one of those DGA-created domain names for the C&C server in advance, using the same algorithm that is embedded in the malware. Eventually, the malware queries the adversary's pre-registered domain name and resolves the IP address of the C&C server. The malware then starts communicating with the C&C server and receives new commands and updates. If the malware cannot find the C&C server at its previous domain name, it queries the next set of DGA-generated domain names until it finds one that works. An adversary may register the domain name as little as 1 hour before an attack and dispose of it within 24 hours.
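The lookup loop described above can be sketched as follows. This is a minimal simulation for analysis purposes, not code from any real malware family: the function name `find_rendezvous`, the candidate list, and the `registered` dictionary standing in for DNS are all hypothetical, and no network traffic is generated. The domain names use the reserved `.example` TLD and the address comes from a documentation-only IP range.

```python
from typing import Callable, Iterable, Optional, Tuple


def find_rendezvous(
    candidates: Iterable[str],
    resolve: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Try each candidate domain in order and return the first one that resolves.

    `resolve` stands in for a DNS lookup: it returns an IP address string,
    or None when the domain does not exist.
    """
    for domain in candidates:
        address = resolve(domain)
        if address is not None:
            return domain, address
    return None  # no candidate resolved; a DGA would move on to the next set


# Simulated DNS: only the one domain "registered in advance" resolves.
registered = {"qwhxkzpa.example": "203.0.113.10"}

# Hypothetical candidate list; a real DGA would compute these names.
candidates = ["abcdefgh.example", "qwhxkzpa.example", "zzzzzzzz.example"]

result = find_rendezvous(candidates, registered.get)
print(result)
```

Passing the resolver in as a parameter keeps the sketch offline; the failed lookups before the hit correspond to the queries for non-existent domains mentioned in the text.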
How does Domain Generation Algorithm (DGA) work?
The seed is the base element of a DGA and serves as a shared secret between the adversary and the malware. It is an aggregated set of parameters chosen by the adversary for generating pseudo-random domain names, which is the main requirement of a Domain Generation Algorithm (DGA). The seed is accessible to both the adversary and the malware.
- The Seed (Shared Secret)
- Static Seed
- Dynamic Seed
The seed is required to calculate the Algorithmically-Generated Domains (AGDs). The DGA takes the seed value as an input parameter, generates pseudo-random strings from it, and algorithmically appends a TLD (.com, .org, .ca, .ru) to each string to output candidate domain names.
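As an illustration of the seed-to-domain step, here is a toy sketch. It is not the algorithm of any real malware family: the use of SHA-256, the function name `generate_domains`, the label length, and the way the TLD is chosen are all assumptions made for the example. Only the TLD list comes from the text.

```python
import hashlib
from typing import List

TLDS = [".com", ".org", ".ca", ".ru"]  # the TLDs mentioned in the text


def generate_domains(seed: str, count: int = 5, length: int = 12) -> List[str]:
    """Derive `count` pseudo-random domain names from a seed string.

    `length` is the number of letters in each label (at most 32 here,
    because a SHA-256 digest has 32 bytes).
    """
    domains = []
    for index in range(count):
        # Hash the seed together with a counter so every iteration
        # produces a different, but reproducible, digest.
        digest = hashlib.sha256(f"{seed}:{index}".encode()).digest()
        # Map the leading digest bytes onto lowercase letters.
        label = "".join(chr(ord("a") + byte % 26) for byte in digest[:length])
        # Choose the TLD from the last digest byte.
        tld = TLDS[digest[-1] % len(TLDS)]
        domains.append(label + tld)
    return domains


for name in generate_domains("example-seed"):
    print(name)
```

Because the output depends only on the seed and the counter, anyone who knows both the algorithm and the seed obtains the same list.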
Algorithmically-Generated Domain (AGD)
The static seed could be a dictionary of words, a concatenation of random strings and numbers, or anything else that the adversary can modify at will. Dynamic seeds are time dependent: the seed changes with time. A daily trending Twitter hashtag, the insignificant digits of a foreign exchange rate, or a weather temperature can also be leveraged as a dynamic seed value. Often the current date and time is used as the seed value in a DGA. The static and dynamic seed elements are combined in an algorithm to generate the pseudo-random strings, and then a TLD such as .com, .ru or .ca is appended to each string to form a domain name.
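The combination of a static and a dynamic seed might look like the sketch below, which uses the calendar date as the dynamic part. This is a hedged illustration only: the names `combine_seed` and `domains_for_day`, the seed string, the `|` separator, and the SHA-256 construction are assumptions for the example and are not taken from any real DGA.

```python
import hashlib
from datetime import date
from typing import List

TLDS = [".com", ".ru", ".ca"]  # TLDs named in the text


def combine_seed(static_seed: str, day: date) -> bytes:
    """Join the static seed with a date-based dynamic seed."""
    return f"{static_seed}|{day.isoformat()}".encode()


def domains_for_day(static_seed: str, day: date, count: int = 3) -> List[str]:
    """Generate the candidate domains for one calendar day."""
    seed = combine_seed(static_seed, day)
    domains = []
    for index in range(count):
        # Append a 4-byte counter so each candidate gets its own digest.
        digest = hashlib.sha256(seed + index.to_bytes(4, "big")).hexdigest()
        label = digest[:10]  # first ten hex characters form the label
        tld = TLDS[int(digest[-2:], 16) % len(TLDS)]
        domains.append(label + tld)
    return domains


# Two parties holding the same algorithm and seed compute the same list.
malware_view = domains_for_day("example-static-seed", date(2024, 1, 15))
adversary_view = domains_for_day("example-static-seed", date(2024, 1, 15))
next_day = domains_for_day("example-static-seed", date(2024, 1, 16))

print(malware_view == adversary_view)
print(malware_view)
print(next_day)
```

The same property that lets the adversary pre-register a domain for a given day also lets an analyst who has recovered the algorithm and seed compute that day's candidates ahead of time.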
The adversary employs a Domain Generation Algorithm (DGA) to generate a large number of pseudo-random domain names and selects a small portion of them for actual C&C use. A DGA gives the adversary's C&C infrastructure a high degree of agility and resilience and makes the C&C server harder to take down. If the C&C domain names or IP addresses are identified and taken down, the malware will eventually obtain the IP address of the relocated C&C server via DNS queries to the next set of DGA-generated domain names. Some malware even employs a DGA and the Fast-Flux technique concurrently to protect the C&C server, which makes it significantly more difficult to detect and take down.
GitHub repository that contains DGA code:
DGA malware reverse engineering: