July 10, 2024

eBPF – Who Watches the Watcher… and What is the Cost?

James Plouffe
Technical & Product Marketing @ BlueRock Systems

In the same way that Linux and containers have become the foundation of modern application development, extended Berkeley Packet Filter (eBPF) has become the de facto technology for observability, and therefore security, for Linux and containers. Despite this prominent role, it’s worth remembering that eBPF was not specifically designed to solve observability or security problems. It was intended to solve the problem of extending kernel functionality while overcoming the limitations of the existing approaches: modifying the kernel source or developing loadable kernel modules (LKMs). The former is a lengthy and complicated process that results in new functionality only being available in the latest kernel(s), which may take years to reach “commodity” status, with no guarantee that the capabilities can or will be backported to earlier versions. The latter is also a challenge, since maintaining compatibility across different Linux distributions and kernel permutations requires that the LKM be matched to a base kernel, including minor and patch releases (e.g., to guarantee compatibility, an LKM built for kernel v6.7.4 should be rebuilt for v6.7.5). eBPF solves both problems by providing a standardized, backward-compatible interface to load external code into the kernel. What that code does is up to the developer… or, potentially, an adversary.

How eBPF handles the security of programs

The security of eBPF is primarily achieved in two ways:

  1. Restricting the loading and running of eBPF programs to privileged users.
  2. Checking eBPF programs with a verifier when they are loaded, before they are allowed to run.

While the combination of these mechanisms may appear sufficient, they both have important limitations and caveats.

First, being a privileged user is not a hard requirement for loading eBPF programs; it is merely considered a best practice and is the default configuration for most distributions. Even so, at least one popular distribution did not initially restrict this capability to privileged users, and it remains possible for sysadmins to change the setting, which creates a risk of misconfiguration that may be difficult to identify in production environments. Additionally, the need for a privileged user may result in an additional identity and set of credentials that need to be protected (e.g., with secure storage and periodic rotation of authentication material).
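
As a rough illustration of how such a misconfiguration might be audited, the sketch below reads the kernel's `kernel.unprivileged_bpf_disabled` sysctl and translates its value. The article does not name this setting; it is assumed here to be the relevant knob, and the helper names (`describe_unprivileged_bpf`, `audit_host`) are hypothetical.

```python
from pathlib import Path

# Documented values of the kernel.unprivileged_bpf_disabled sysctl.
MEANINGS = {
    0: "unprivileged users may call bpf()",
    1: "unprivileged bpf() disabled; cannot be re-enabled until reboot",
    2: "unprivileged bpf() disabled; an administrator can still change it",
}


def describe_unprivileged_bpf(raw: str) -> str:
    """Translate the sysctl's raw text value into a readable finding."""
    try:
        value = int(raw.strip())
    except ValueError:
        return f"unrecognized value {raw!r}"
    return MEANINGS.get(value, f"unrecognized value {value}")


def audit_host(path: str = "/proc/sys/kernel/unprivileged_bpf_disabled") -> str:
    """Report the setting on this host, or note that the sysctl is absent."""
    sysctl = Path(path)
    if not sysctl.exists():
        return "sysctl not present (not Linux, or kernel built without bpf())"
    return describe_unprivileged_bpf(sysctl.read_text())


if __name__ == "__main__":
    print(audit_host())
```

A value of 0 is the case the paragraph above warns about. A check like this only covers one host at one point in time, so in practice it would need to run across the fleet and be repeated after configuration changes.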

Second, although the name “verifier” implies a comprehensive evaluation of the code, the verifier relies on algorithmic and static analysis techniques rather than formal methods. In effect, developers of the verifier are required to imagine all the ways in which an eBPF program could misbehave, and then write logic to identify and prevent that misbehavior. As general software quality practices have repeatedly demonstrated, this approach is necessary but not sufficient to prevent defects. Moreover, the goals of the verifier with regard to program behavior are surprisingly narrow: the verifier checks that an eBPF program will not crash the kernel, that it is guaranteed to terminate rather than run indefinitely, and that it does not attempt unauthorized memory access. So while the verifier provides some basic protection, that protection emphasizes stability and performance over other concerns. It should also be noted that, unlike LKMs, there is currently no signature checking for eBPF programs. One could be forgiven for coming away with the impression that eBPF programs are to Linux as macros are to Microsoft Excel, and for wondering whether there are any meaningful parallels (what could possibly go wrong?). It turns out that’s just part of the problem because, as we’ll see, the eBPF subsystem itself is prone to defects, some of which have enabled privilege escalation and execution of malicious code.
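
To make the “enumerate the misbehaviors, then write a check for each” approach concrete, here is a deliberately tiny sketch of a rule-based checker. It is not the kernel verifier and does not use the eBPF instruction set: the opcodes (`load`, `store`, `jmp`, `exit`), the limits, and the function name `verify` are all invented for illustration.

```python
STACK_SIZE = 512   # toy stack size in bytes (illustrative limit)
MAX_INSNS = 4096   # toy cap on program length (illustrative limit)


def verify(program):
    """Return a list of rejection reasons; an empty list means 'accepted'.

    Each instruction is a tuple: ("load", offset), ("store", offset),
    ("jmp", target_pc) or ("exit",).
    """
    if not program:
        return ["empty program"]
    errors = []
    if len(program) > MAX_INSNS:
        errors.append("too many instructions")
    # Forward-only jumps plus a final exit guarantee that the program counter
    # only ever increases, so every run terminates.
    if program[-1][0] != "exit":
        errors.append("last instruction is not exit")
    for pc, insn in enumerate(program):
        op = insn[0]
        if op in ("load", "store"):
            offset = insn[1]
            if not 0 <= offset < STACK_SIZE:
                errors.append(f"pc {pc}: offset {offset} is outside the stack")
        elif op == "jmp":
            target = insn[1]
            if not pc < target < len(program):
                errors.append(f"pc {pc}: jump to {target} is backward or out of range")
        elif op != "exit":
            errors.append(f"pc {pc}: unknown opcode {op!r}")
    return errors
```

For example, `verify([("store", 0), ("load", 0), ("exit",)])` returns an empty list, while a program containing `("jmp", 0)` or `("load", 600)` is rejected. Note what the checker does not do: it has no notion of what an accepted program is for, and any misbehavior its authors did not anticipate passes straight through.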

The security of eBPF itself

As with any software feature, there is always a possibility that bugs will manifest as security vulnerabilities, and eBPF is no exception. A cursory search of Linux CVE announcements revealed 10 recently fixed vulnerabilities in eBPF. From a historical perspective, one of the most significant vulnerabilities, CVE-2021-31440 (a.k.a. ZDI-21-503/ZDI-CAN-13661), enabled attackers to escalate privileges and execute arbitrary code in kernel space because the verifier did not “perform proper validation of user-supplied eBPF programs prior to executing them”. Another example is CVE-2022-2322, which allowed “a kernel information leak”. The most recent example is CVE-2023-2163, the effects of which were “unsafe code paths being incorrectly marked as safe”. In other words, there are multiple cases of the verifier failing in its essential role.

This last vulnerability is also significant because it was the first to be identified through the use of Buzzer, an eBPF fuzzing toolchain publicly released by Google in mid-2023. Given its prominence in cloud-native infrastructure and apps, eBPF certainly warrants additional scrutiny. The availability of better tooling can definitely help kernel developers and maintainers identify and eliminate bugs that could threaten the integrity of eBPF, but that same tooling also relieves adversaries of some of the burden of vulnerability discovery, and therefore of exploit development.
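
The general idea behind fuzzing a verifier can be sketched in a few lines: generate random programs, and flag any program that the verifier accepts but that misbehaves when actually run. The toy below is not a description of Buzzer's internals. It uses an invented “program” format (a list of stack offsets to write to) and a checker with a planted off-by-one bug; every name in it is hypothetical.

```python
import random

STACK_SIZE = 8  # deliberately tiny toy stack so the bug is found quickly


def buggy_verify(offsets):
    """Toy verifier with a planted off-by-one: it accepts offset == STACK_SIZE."""
    return all(0 <= off <= STACK_SIZE for off in offsets)


def run_checked(offsets):
    """Reference interpreter: True only if every store stays inside the stack."""
    stack = [0] * STACK_SIZE
    for off in offsets:
        if not 0 <= off < STACK_SIZE:
            return False  # out-of-bounds store detected at run time
        stack[off] = 1
    return True


def fuzz(iterations, seed=0):
    """Return random programs that the verifier accepted but that misbehaved."""
    rng = random.Random(seed)
    findings = []
    for _ in range(iterations):
        length = rng.randint(1, 4)
        program = [rng.randint(-2, STACK_SIZE + 2) for _ in range(length)]
        if buggy_verify(program) and not run_checked(program):
            findings.append(program)
    return findings


if __name__ == "__main__":
    hits = fuzz(2000)
    print(f"{len(hits)} verifier bypasses found, e.g. {hits[0]}")
```

Every finding contains the offset `STACK_SIZE`, the one value the buggy check lets through. The same loop serves a maintainer looking for bugs to fix and an adversary looking for bugs to exploit, which is the double-edged point made above.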

The efficacy of eBPF-based security tools

The ability to use eBPF to create arbitrary instrumentation is what makes it so attractive from an observability and security perspective. Even so, the use of eBPF for security monitoring feels very much like the early days of Endpoint Detection and Response (EDR). The first EDR products could collect all kinds of new telemetry, but before ML and AI (or, for the more level-headed practitioners, linear regression and logistic regression) became the “norm” in endpoint security, the onus was on defenders to identify malicious activity, and to a large extent it still is. Those defenders include SOC analysts, incident responders, and the emerging job category of detection engineers. The efficacy of early EDR deployments depended in large measure on what was observed and the conclusions that were drawn about it. The same is true for eBPF. Early EDR implementations and eBPF are also similar with regard to “Response”: the response is generally not automated. The emphasis is primarily on generating telemetry for a human to act on, whether that is proactive containment and remediation or simply reviewing forensics as part of the investigation, because the number of remedial actions that can be performed is limited.

Unlike the early days of EDR, which focused almost exclusively on the Windows endpoint monoculture, however, the permutations for modern workloads are almost incalculable: things that are (or seem) normal in one environment may be unambiguously bad in another. The diverse nature of modern applications makes it harder to identify “invariants” that apply in multiple contexts. As a result, the difficulty in choosing what to instrument and the complexity of the detection logic increase considerably. Although many eBPF-based security solutions include a stock set of rules out of the box, they will usually require experimentation and tuning, which can increase the load on already overburdened security teams. These tools may also need to be incorporated during the development pipeline, which could risk the implementation being done incorrectly or not at all, especially considering that, unlike general-purpose observability tools, security products do not necessarily provide an immediate, tangible benefit directly to developers.

Conclusion

In considering eBPF-based security solutions, it’s critical to account for the additional attack surface that eBPF may expose, as well as the additional strain on existing staff and processes that eBPF-based tooling may create. It’s also worth factoring in the ways that eBPF can be circumvented or evaded even without specific vulnerabilities. Although eBPF is very powerful, like virtually all tools, it is not without a certain amount of danger and the results it produces can vary widely depending on how it is used. Ultimately, the biggest shortcoming of eBPF may be its circular logic: it is expected to make guarantees about the integrity of an environment, but those very guarantees depend on the integrity of the environment not being compromised.
