Summary of CrowdStrike Falcon Sensor Incident Report

This article summarizes the root cause analysis (RCA) report on the incident involving Channel File 291 of the CrowdStrike Falcon sensor. The report covers the findings, the mitigations, the technical details, and the root cause of the defect that led to widespread Windows system crashes on July 19, 2024.

What Happened

The CrowdStrike Falcon sensor uses AI and machine learning models to protect customer systems. These models are continuously updated with threat intelligence and telemetry. The sensor correlates data in its local graph store and identifies Indicators of Attack (IOAs) using a combination of built-in Sensor Content and cloud-delivered Rapid Response Content.

In February 2024, sensor version 7.11 introduced a new Template Type for detecting attacks that abuse Windows interprocess communication (IPC) mechanisms. The new Template Type defined 21 input fields, but the sensor's integration code supplied only 20 input values. The discrepancy was not caught during initial testing because the tests and the first Template Instances used wildcard matching for the 21st input, so the missing value was never read.

On July 19, 2024, new IPC Template Instances were deployed that included a non-wildcard criterion for the 21st input. Evaluating that criterion caused an out-of-bounds memory read and the subsequent system crashes. Three factors combined to cause the incident: a mismatch between the 21 inputs that were validated and the 20 that were actually provided, a latent out-of-bounds read in the code that evaluates the criteria, and the absence of any test with a non-wildcard criterion for the 21st input.
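
The masking effect of the wildcard can be sketched in a few lines. This is an illustrative Python model, not CrowdStrike's code: `WILDCARD`, `matches`, and the list-based layout are assumptions made for the example, and Python raises `IndexError` at the point where the native sensor code performed an out-of-bounds memory read.

```python
WILDCARD = None  # hypothetical marker meaning "match anything, do not inspect the input"

def matches(criteria, inputs):
    """Compare each non-wildcard criterion with the input at the same index."""
    for i, criterion in enumerate(criteria):
        if criterion is WILDCARD:
            continue  # the input is never read, so a missing value goes unnoticed
        if inputs[i] != criterion:  # no bounds check: fails when i >= len(inputs)
            return False
    return True

inputs = [f"value{i}" for i in range(20)]        # integration code supplies 20 values

early_instance = [WILDCARD] * 21                 # 21st criterion is a wildcard
july_instance = [WILDCARD] * 20 + ["pipe-name"]  # 21st criterion is specific

print(matches(early_instance, inputs))
try:
    matches(july_instance, inputs)
except IndexError:
    print("read past the end of the input array")
```

As long as the 21st criterion is a wildcard, the loop never touches index 20, which is why the mismatch could sit unnoticed from February until July.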

Findings and Mitigations

1. Validation of Input Fields at Compile Time

Findings: The IPC Template Type expected 21 inputs, but the code provided only 20, which went undetected during testing.

Mitigation: A patch was developed to validate, at sensor compile time, that the number of inputs supplied matches the number of fields the Template Type defines. The patch went into production on July 27, 2024.
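
Python has no separate compile step, so the closest analogue is a check that runs once when definitions are loaded, before any content is evaluated. The sketch below uses hypothetical names (`TemplateType`, `register`) and simply refuses to pair a Template Type with integration code that supplies a different number of inputs. It illustrates the idea of the mitigation, not the actual patch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TemplateType:
    name: str
    field_count: int  # number of input fields the type defines

def register(template_type: TemplateType, supplied_input_count: int) -> TemplateType:
    """Fail at load time if the definition and the integration code disagree."""
    if template_type.field_count != supplied_input_count:
        raise ValueError(
            f"{template_type.name}: defines {template_type.field_count} fields, "
            f"integration code supplies {supplied_input_count}"
        )
    return template_type

# Accepted: both sides agree on 21 inputs.
ipc = register(TemplateType("IPC", 21), 21)
# register(TemplateType("IPC", 21), 20) would raise ValueError before any content runs.
```

In a compiled language the same comparison could be expressed as a static assertion so that a mismatched build never ships.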

2. Runtime Array Bounds Check

Findings: The Content Interpreter read beyond the input array due to a missing bounds check, causing crashes.

Mitigation: Bounds checks were added to the Content Interpreter on July 25, 2024. This fix prevents out-of-bounds reads and will be backported to all relevant sensor versions.
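
A runtime bounds check of this kind might look like the following Python sketch. The names are hypothetical, and treating a missing input as a non-match is an assumption made for the example; the report summary above does not say how the real Content Interpreter handles that case.

```python
WILDCARD = None  # hypothetical marker meaning "match anything"

def matches_checked(criteria, inputs):
    """Evaluate criteria against inputs without ever indexing past the inputs."""
    for i, criterion in enumerate(criteria):
        if criterion is WILDCARD:
            continue
        if i >= len(inputs):
            return False  # bounds check: no input exists for this field
        if inputs[i] != criterion:
            return False
    return True

inputs = ["value"] * 20
specific_21st = [WILDCARD] * 20 + ["pipe-name"]
print(matches_checked(specific_21st, inputs))  # no exception, the rule just does not match
```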

3. Expanded Testing Criteria

Findings: Testing did not include non-wildcard criteria for the 21st input, so the out-of-bounds read was never exercised.

Mitigation: Automated tests now include non-wildcard criteria for all fields. Future Template Types will have expanded test cases reflecting production scenarios.
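
One way to picture such a test is to generate one Template Instance per field, each with a non-wildcard criterion in that field only, and run every one through the matcher. The Python sketch below is a hypothetical harness (`unchecked_match` and `failing_fields` are invented names) run against a deliberately unchecked matcher to show which field the test would flag.

```python
WILDCARD = None  # hypothetical marker meaning "match anything"

def unchecked_match(criteria, inputs):
    """Matcher with no bounds check, standing in for the pre-fix behavior."""
    return all(c is WILDCARD or inputs[i] == c for i, c in enumerate(criteria))

def failing_fields(field_count, inputs):
    """Return the indexes of fields whose non-wildcard test reads a missing input."""
    failures = []
    for i in range(field_count):
        criteria = [WILDCARD] * field_count
        criteria[i] = "probe"  # exactly one non-wildcard criterion per test case
        try:
            unchecked_match(criteria, inputs)
        except IndexError:
            failures.append(i)
    return failures

print(failing_fields(21, ["value"] * 20))  # index 20 is the 21st field
```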

4. Content Validator Logic Error

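Findings: The Content Validator assumed that the IPC Template Type would be supplied with 21 inputs. It therefore accepted Template Instances with matching criteria for the 21st input, and the problematic instances were released.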

Mitigation: Additional checks were added to ensure Template Instances do not exceed the number of provided inputs. This fix will be live by August 19, 2024.
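
The added validator check can be sketched as a function that rejects any Template Instance whose non-wildcard criteria refer to a field beyond the inputs actually supplied. The Python below is an illustration under assumed names (`validate_instance`, `WILDCARD`), not the real Content Validator.

```python
WILDCARD = None  # hypothetical marker meaning "match anything"

def validate_instance(criteria, supplied_input_count):
    """Return a list of problems; an empty list means the instance may ship."""
    errors = []
    for i, criterion in enumerate(criteria):
        if i >= supplied_input_count and criterion is not WILDCARD:
            errors.append(
                f"field {i + 1} has matching criteria but only "
                f"{supplied_input_count} inputs are supplied"
            )
    return errors

print(validate_instance([WILDCARD] * 21, 20))                  # accepted
print(validate_instance([WILDCARD] * 20 + ["pipe-name"], 20))  # rejected
```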

5. Template Instance Testing within Content Interpreter

Findings: The initial Template Instances passed stress tests, but those tests did not exercise the mismatched input count.

Mitigation: New procedures ensure every Template Instance undergoes thorough testing within the Content Interpreter before deployment.

6. Staged Deployment of Template Instances

Findings: Template Instances lacked staged rollouts, increasing the risk of widespread issues.

Mitigation: Staged deployments with canary testing and telemetry collection are now standard, enabling gradual rollouts and rollback capabilities if problems arise. Customers can now control the timing and scope of Rapid Response Content updates.
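
The control flow of a staged rollout can be sketched as a loop over deployment rings that stops as soon as one ring reports unhealthy telemetry. Everything in this Python example is hypothetical: the ring names, the `deploy` and `healthy` callbacks, and the shape of the result are assumptions chosen to show the idea, not CrowdStrike's deployment system.

```python
def staged_rollout(rings, deploy, healthy):
    """Deploy ring by ring; halt at the first unhealthy ring and list what to roll back."""
    completed = []
    for ring in rings:
        deploy(ring)
        completed.append(ring)
        if not healthy(ring):  # e.g. crash telemetry from this ring exceeds a threshold
            return {"status": "halted", "roll_back": completed}
    return {"status": "complete", "roll_back": []}

deployed = []
result = staged_rollout(
    rings=["canary", "early", "general"],
    deploy=deployed.append,
    healthy=lambda ring: ring != "early",  # simulate a problem surfacing in the second ring
)
print(result)
print(deployed)  # the "general" ring never receives the update
```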

Independent Third-Party Review

CrowdStrike has engaged two independent security vendors to review the Falcon sensor code and the end-to-end quality process. These reviews focus on the impacted code and process improvements, ensuring enhanced security and quality assurance.

Technical Details

CrowdStrike delivers security content updates through Sensor Content and Rapid Response Content. The latter relies on several components, including the Content Interpreter, Template Types, and the Content Validator. The Rapid Response Content, interpreted via regex-based engines, is crucial for timely threat detection without altering sensor code.

Kernel Driver Usage in Security Products

CrowdStrike’s kernel driver provides deep visibility into system activities, critical for preventing and blocking malicious actions. The sensor's kernel-level operations, certified through the Windows Hardware Quality Labs (WHQL) program, enable robust security features.

Crash Dump Analysis

An analysis of a kernel crash dump revealed that accessing an out-of-bounds 21st input caused a memory fault. This detailed analysis highlights the root technical issue and supports the proposed mitigations to prevent similar incidents.

Kelly
I'm your source for the latest in tech news and updates. Stay informed with my articles on the most exciting developments in the tech world