Redis or Not – Revealing a Critical Vulnerability in Argo CD Kubernetes Controller

Age
8 months ago
Threat Information
Summary

A critical vulnerability, CVE-2024-31989, with a severity score of 9.1, has been discovered in Argo CD, a GitOps continuous delivery tool for Kubernetes. The flaw lets an attacker who already has a foothold in the cluster abuse the Argo CD server's elevated permissions to escalate privileges and potentially take control of the entire Kubernetes cluster. The attack manipulates data in Argo CD’s Redis caching server, which by default has no password protection and can be reached by any pod in the cluster. Because the verification hash stored alongside each cached application state manifest is computed without a secret key, an attacker can alter a manifest, recalculate its hash, and trick Argo CD into accepting the malicious update. To mitigate the vulnerability, users should upgrade Argo CD to a patched release (2.11.1, 2.10.10, 2.9.15, or 2.8.19), ensure the network policy "argocd-redis-network-policy" is present and enabled, and use controller-based secrets management tools. The vulnerability was responsibly disclosed, and patches have been released to address it.
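
As a rough illustration of the upgrade guidance above, the following sketch checks whether a version string is at or above the patched release for its release line. The function names are hypothetical, and two assumptions go beyond the summary: release lines newer than 2.11 are treated as patched, and lines older than 2.8 are treated as unpatched because no fixed release is listed for them.

```python
import re

# Patched releases named in the summary, keyed by (major, minor) release line.
PATCHED = {(2, 11): 1, (2, 10): 10, (2, 9): 15, (2, 8): 19}


def parse_version(text):
    """Parse strings such as 'v2.10.9' or '2.10.9+abc123' into (major, minor, patch)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_patched(text):
    """Return True if the version is at or above the patched release for its line."""
    major, minor, patch = parse_version(text)
    if (major, minor) in PATCHED:
        return patch >= PATCHED[(major, minor)]
    # Assumption (not from the advisory text): newer lines carry the fix,
    # older lines have no listed fixed release and are treated as unpatched.
    return (major, minor) > (2, 11)


if __name__ == "__main__":
    for version in ["v2.10.9", "2.10.10", "2.8.19", "2.7.3"]:
        print(version, "patched" if is_patched(version) else "NOT patched")
```

A version check of this kind only covers the upgrade step; the network policy and secrets-handling recommendations still need to be verified separately.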

How BlueRock Helps

The attack on Argo CD begins when an attacker with a foothold in the cluster reaches the Redis caching server, which lacks password protection by default. By altering the cached application state manifests, the attacker tricks Argo CD into applying malicious updates with its own elevated permissions. When the attacker tries to deploy malicious pods with high privileges in this way, BlueRock's Container Capability Control mechanism ensures that only containers with specified capabilities are allowed to run, preventing the deployment of these potentially harmful containers. The attacker then attempts to execute code on the host node through the privileged pod. Here, BlueRock's Reverse Shell Protection mechanism prevents unauthorized attempts to bind shell input and output streams to network sockets, mitigating the risk of reverse shell attacks. Together, these mechanisms protect the Kubernetes environment from unauthorized container deployments and malicious script execution, preserving the integrity and security of the cluster.
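
To make the idea of a capability allowlist concrete, here is a minimal sketch that inspects a pod spec (as a plain dictionary) and reports containers that request privileged mode or capabilities outside an allowlist. It is not BlueRock's implementation; the allowlist contents and the function names are hypothetical and chosen only for illustration.

```python
# Hypothetical allowlist; a real policy would be defined by the operator.
ALLOWED_CAPABILITIES = {"NET_BIND_SERVICE", "CHOWN"}


def normalize(capability):
    """Accept both 'SYS_ADMIN' and 'CAP_SYS_ADMIN' spellings."""
    capability = capability.upper()
    return capability[4:] if capability.startswith("CAP_") else capability


def capability_violations(pod_spec, allowed=ALLOWED_CAPABILITIES):
    """Return a list of human-readable violations found in a pod spec dict."""
    violations = []
    containers = (pod_spec.get("initContainers") or []) + (pod_spec.get("containers") or [])
    for container in containers:
        name = container.get("name", "<unnamed>")
        security = container.get("securityContext") or {}
        if security.get("privileged"):
            violations.append(f"{name}: privileged mode requested")
        requested = (security.get("capabilities") or {}).get("add") or []
        for capability in requested:
            if normalize(capability) not in allowed:
                violations.append(f"{name}: capability {normalize(capability)} not allowed")
    return violations


if __name__ == "__main__":
    bad_pod = {
        "containers": [{
            "name": "everything-allowed",
            "securityContext": {
                "privileged": True,
                "capabilities": {"add": ["CAP_SYS_ADMIN", "NET_BIND_SERVICE"]},
            },
        }]
    }
    for violation in capability_violations(bad_pod):
        print(violation)
```

The sketch only evaluates the manifest; an enforcement mechanism of the kind described above would act at container start rather than on the declared spec alone.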

MITRE ATT&CK Techniques Inferred
  • T1068: Exploitation for Privilege Escalation: The article mentions that the attacker exploits the elevated permissions of the Argo CD server to escalate their privileges and potentially take control of the Kubernetes cluster. This indicates the use of privilege escalation techniques.
  • T1565.001: Data Manipulation: Stored Data Manipulation: The attacker manipulates the data stored in Argo CD’s Redis caching server, which lacks password protection, altering the cached application state manifests to exploit the system. This is manipulation of stored application state and configuration data.
  • T1059: Command and Scripting Interpreter: The attacker uses the Argo CD server’s elevated permissions to deploy malicious pods with high privileges, which can execute code on the host node and access sensitive information. This demonstrates the use of deploying malicious code or scripts.
  • T1553: Subvert Trust Controls: The attacker recalculates the hash for the manipulated manifest without a secret key, allowing them to modify the manifest data and present it as legitimate. Recomputing an unkeyed hash in this way bypasses the integrity check that Argo CD relies on to trust cache entries.
  • T1610: Deploy Container: The attacker creates a low-privilege pod in another namespace to simulate a compromised pod within the cluster. This is a form of creating and using a container for malicious purposes.
  • T1140: Deobfuscate/Decode Files or Information: The attacker uses a Go program to decompress gzipped contents in the Redis server to uncover cached information about the applications managed by Argo CD, including their manifests and details about the Kubernetes cluster. This involves extracting and analyzing compressed data.
  • T1057: Process Discovery: The attacker uses Redis profiler to observe interactions between pods and the Redis server to understand how application behavior is affected. This is a form of gathering information through monitoring and profiling.
  • T1611: Escape to Host: The attacker modifies the application manifest in Redis and successfully deploys a privileged pod that mounts the host node's filesystem, which allows them to escalate privileges and execute code on the host node. This is a breakout from the container to the underlying host.
  • T1021.004: Remote Services: SSH: The attacker adds their public SSH key to the node’s authorized keys, allowing them to connect to the node with an SSH shell. This is indicative of the use of SSH for remote access.

Fact-Based Attack Chains
F1: Exploitation of CVE-2024-31989 by manipulating Argo CD's Redis cache to deploy a privileged pod and gain host access, assuming default insecure Redis configuration.
  • Attacker gains initial access to a low-privilege pod within the Kubernetes cluster where Argo CD is deployed. (Cited from: "we created a low-privilege pod in another namespace, simulating a compromised pod within the cluster. It could potentially be a webshell, a malicious package, or any other low-access point to the cluster.")
  • Attacker verifies if Argo CD's Redis instance is accessible, potentially checking for the default lack of password protection and the absence or misconfiguration of network policies intended to restrict access. (Cited from: "Redis caching server, which, by default, lacks password protection and can be accessed by any pod within the cluster if proper network policies are not in place.", "Verify that the Network Policy rule named 'argocd-redis-network-policy' is present and enabled")
  • From the compromised pod, the attacker resolves the Argo CD Redis service address (e.g., using kube-DNS) and connects to it on port 6379. (Cited from: "We resolved the Redis server's address using the local kube-DNS server and attempted to connect to it on port 6379. Surprisingly, it worked! Knowing that the Redis server doesn't require a password by default...")
  • BR-87: Process Socket Deny - This mechanism is applicable because it can prevent the compromised pod's process (e.g., shell, script, custom tool) from initiating outgoing network connections to the Redis server if a policy denies socket operations for that specific process name or path.
  • Attacker identifies keys storing cached application manifests within Redis, potentially using Redis commands or observing traffic (like the researchers using the profiler). The content is found to be compressed (gzip). (Cited from: "We found many keys that seem to reflect data about our deployed application.", "When we tried to view their content, it appeared compressed, likely in gzip format.", "We utilized Redis’s profiler to analyze the server’s events as we observed that the Application Controller pod is accessing a key named ‘mfst|app.kubernetes.io/instance|myapp|42…|1.8.3.gz’.")
  • BR-87: Process Socket Deny - This mechanism is applicable because the process used to interact with Redis (e.g., redis-cli, a script) requires network socket access. If this process is denied socket operations by policy, the attacker cannot list or observe keys.
  • Attacker develops or uses a tool to decompress the gzip data to read the cached manifest content. (Cited from: "Using the Argo CD source code as a reference, we developed a Go program to decompress the gzipped contents and uncover their values.")
  • Attacker crafts a malicious Kubernetes manifest (e.g., deploying a privileged pod using configurations like those from the BadPods project) designed to escalate privileges or gain host access. (Cited from: "As a proof of concept, we deployed a privileged pod using a configuration from BadPods project on GitHub.")
  • Attacker modifies the target manifest data within the identified Redis key, replacing the legitimate manifest content with their malicious version.
  • BR-87: Process Socket Deny - This mechanism is applicable because writing the modified data back to Redis requires network socket access. If the process performing the write is denied socket operations, this step fails.
  • Attacker observes that the Argo CD Repo Server initially reverts the change because the cacheEntryHash no longer matches the content. (Cited from: "However, once we changed the manifest, the Argo CD Repo Server quickly reverted the change, restoring the original value.", "Upon re-examining the manifest... we noticed the 'cacheEntryHash' key...")
  • BR-87: Process Socket Deny - This mechanism is applicable because observing the revert likely involves reading from Redis again. If the process performing the read is denied socket operations, this observation might fail or be delayed.
  • Attacker replicates Argo CD's hashing algorithm (fnv64a, i.e., 64-bit FNV-1a) to calculate the correct cacheEntryHash for the manipulated manifest data. Because the hash is not keyed, anyone who can read the algorithm can produce a valid value. (Cited from: "According to the source code, this hash is intended to validate the manifest. However, it is generated without any private secret signing it.", "We developed a program to replicate the fnv64a hashing algorithm in order to regenerate the checksum hash for our manipulated manifest.")
  • Attacker updates both the manifest content and its corresponding cacheEntryHash value in Redis. (Cited from: "Afterward, the modification to the manifest in Redis was successful.")
  • BR-87: Process Socket Deny - This mechanism is applicable because writing both the manifest and the hash to Redis requires network socket access. If the process performing the write is denied socket operations, this step fails.
  • The Argo CD Application Controller reads the manipulated manifest and hash from Redis, accepts it as a valid cache entry, detects the application is 'out of sync' with this desired (malicious) state, and applies the changes to the Kubernetes cluster. (Cited from: "This breach leads Argo CD Server to accept the altered manifest as a valid cache entry, unintentionally triggering an unauthorized update to the cluster’s state.", "During the next state validation by the Application Controller, it detected that it was ‘out of sync’ and deployed our changes to the Kubernetes application.")
  • The attacker's malicious pod (e.g., privileged, mounting host filesystem) is deployed to the cluster. (Cited from: "By exploiting this flaw, attackers can change the application manifest to deploy malicious pods with high privileges...", "The newly deployed pod possesses elevated capabilities. It also mounts the host node’s filesystem to our pod.")
  • BR-47: Container Capability Control - This mechanism is applicable because it allows defining an allowlist of capabilities for containers, overriding requests in the manifest. If the malicious pod requests excessive capabilities (e.g., CAP_SYS_ADMIN, CAP_NET_ADMIN) not permitted by the BR-47 policy, the pod's creation or functionality will be restricted, preventing the intended privilege escalation.
  • BR-67: Container Root User Control - This mechanism is applicable because it detects and prevents processes (other than the container's init process) from running as root within the container. If the attacker's pod attempts to run subsequent processes as root, BR-67 can block these actions.
  • BR-66: Host FS Mount Control - This mechanism is applicable because it enforces an allowlist for host filesystem paths that can be mounted into containers. If the malicious manifest specifies mounting sensitive host directories (like /, /etc, /root, /var/run/docker.sock) that are not on the BR-66 allowlist, the mount operation during pod initialization will fail, hindering host access.
  • BR-57: Cluster Drift Protection - This mechanism aims to prevent unauthorized pod deployments. Here, however, the deployment is performed by Argo CD through the Kubernetes API and is therefore authorized, albeit based on manipulated data. BR-57 is thus less directly applicable to this specific deployment step, but it remains relevant to the overall goal of controlling pod creation.
  • BR-61: Container Runtime Socket Protection - This mechanism is applicable if the attacker's ultimate goal after compromising the pod involves directly interacting with the container runtime socket (e.g., containerd.sock) from an unauthorized process lineage (not kubelet). It wouldn't prevent the initial deployment by Argo CD but could block subsequent container escape/manipulation attempts via the runtime socket.
  • BR-78: Host Setuid File Protection - This mechanism is applicable if the deployed pod, having mounted the host filesystem, attempts to write to a setuid file owned by the host root (e.g., to modify sudo for persistence). BR-78 would block such write attempts.
  • Attacker uses the privileged access granted by the malicious pod to achieve objectives like executing code on the host node (e.g., adding their SSH key to authorized_keys). (Cited from: "execute code on the host node", "We decided to add our public SSH key to the node’s authorized keys, which allows us to connect to the node with an SSH shell.")
  • BR-55: Reverse Shell Protection - This mechanism is applicable if the attacker, having gained execution within the malicious pod or on the host, attempts to establish an interactive command and control channel by binding a shell's standard input/output/error streams to a network socket. BR-55 detects and blocks this specific behavior.
  • BR-88: Process Path Exec Allow - This mechanism is applicable because if the attacker downloads or creates tools/scripts on the host node (e.g., in /tmp or /var/tmp) and attempts to execute them, BR-88 will block the execution if the path is not in the configured allowlist (e.g., /bin, /usr/bin).
  • BR-90: Process Exec Deny - This mechanism is applicable because if the attacker attempts to execute common network or recon tools like nc, wget, or curl (from any path ending in /nc, /wget, /curl by default), BR-90 will block their execution.
  • BR-62: Linux/Host Drift Protection - This mechanism is applicable because if the attacker introduces new executable files onto the host filesystem (not via a trusted package manager) and attempts to run them, BR-62 will detect this drift from the baseline and block execution.
  • BR-65: Container Host Drift Prevention - This mechanism is applicable similarly to BR-62, focusing on ensuring only allow-listed privileged containers or host processes can execute new/modified host files added post-boot.
  • BR-54: Container Drift Protection (Binaries & Scripts) - This mechanism is applicable if the attacker attempts to execute a binary or script inside the malicious container that was not part of its original image. BR-54 maintains a manifest of original executables and blocks execution of any new ones.
  • BR-75: Critical Directory Write Protection - This mechanism is applicable if the attacker attempts to write to critical host directories defined in policy (e.g., modifying system configurations, adding persistent services). Adding an SSH key to /root/.ssh/authorized_keys could be blocked if /root/.ssh is protected.
  • BR-91: Sensitive File Access - This mechanism is applicable because it monitors and can block access attempts to predefined sensitive files or patterns. Accessing or modifying /root/.ssh/authorized_keys or /etc/shadow would trigger this protection if configured.
  • BR-86: PTrace Protection - This mechanism is applicable if the attacker uses the ptrace system call for process injection, memory inspection/tampering, or debugging security tools as part of their post-exploitation activity on the host.
  • BR-53: SSH Deep Auth & SSH Least Privilege - While not blocking the key addition directly, if the cluster uses BR-53 for SSH authentication, simply adding a key to authorized_keys might not grant the expected level of access if ephemeral certificates and IdP tokens are required for login, thus mitigating the impact.
  • The attacker can potentially remove evidence of the attack, possibly by manipulating the Redis cache again or using their host access. (Cited from: "Moreover, the attack allows for the removal of all evidence.")
  • BR-75: Critical Directory Write Protection - This mechanism is applicable because it can prevent the attacker from deleting or modifying log files located within critical directories defined in the policy.
  • BR-91: Sensitive File Access - This mechanism is applicable because it can block attempts to read or write to specific log files (like /var/log/auth.log, audit logs) if they are configured as sensitive.
  • BR-88: Process Path Exec Allow / BR-90: Process Exec Deny - These mechanisms are applicable if the attacker uses specific tools or scripts (e.g., custom log wiping tools, standard utilities like shred or rm) to remove evidence. Execution could be blocked if the tool is run from a disallowed path (BR-88) or if the executable name itself is denied (BR-90).
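
The hash-recalculation step in F1 works because the cache entry hash is an unkeyed checksum. The sketch below contrasts a 64-bit FNV-1a checksum with a keyed HMAC-SHA256 tag to show the difference. It does not reproduce Argo CD's serialization or cache format, the payloads and secret are made-up placeholders, and the HMAC variant is shown only as a generic example of a keyed integrity check, not as the fix that was shipped.

```python
import hashlib
import hmac

FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3


def fnv1a_64(data: bytes) -> int:
    """Compute the 64-bit FNV-1a checksum of data (no secret involved)."""
    digest = FNV64_OFFSET_BASIS
    for byte in data:
        digest ^= byte
        digest = (digest * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return digest


def keyed_tag(secret: bytes, data: bytes) -> str:
    """Compute an HMAC-SHA256 tag; producing it requires the secret."""
    return hmac.new(secret, data, hashlib.sha256).hexdigest()


def verify_keyed(secret: bytes, data: bytes, tag: str) -> bool:
    """Constant-time comparison of the expected tag with the supplied one."""
    return hmac.compare_digest(keyed_tag(secret, data), tag)


if __name__ == "__main__":
    tampered = b'{"kind": "Pod", "privileged": true}'  # placeholder payload

    # Unkeyed: whoever can write the data can also write a matching checksum.
    forged_checksum = fnv1a_64(tampered)
    print("unkeyed check passes:", fnv1a_64(tampered) == forged_checksum)

    # Keyed: without the verifier's secret, a forged tag does not verify.
    verifier_secret = b"known-only-to-the-verifier"  # hypothetical secret
    forged_tag = keyed_tag(b"attacker-guess", tampered)
    print("keyed check passes:", verify_keyed(verifier_secret, tampered, forged_tag))
```

With the unkeyed checksum the comparison always succeeds for attacker-written data, whereas the keyed tag fails unless the attacker also holds the secret.
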
F2: Accessing potentially sensitive information (including Kubernetes Secrets) stored in Argo CD's Redis cache due to insecure defaults.
  • Attacker gains initial access to a low-privilege pod within the Kubernetes cluster where Argo CD is deployed. (Cited from: "we created a low-privilege pod in another namespace, simulating a compromised pod within the cluster...")
  • Attacker connects to the Argo CD Redis server, exploiting the default lack of password and potentially missing NetworkPolicies. (Cited from: "Redis caching server, which, by default, lacks password protection and can be accessed by any pod within the cluster if proper network policies are not in place.", "We resolved the Redis server's address... and attempted to connect to it... Surprisingly, it worked!")
  • BR-87: Process Socket Deny - This mechanism is applicable because it can prevent the compromised pod's process from initiating the network connection to the Redis server if that process is on the deny list for socket operations.
  • Attacker understands that secrets might be present in the Redis cache, either because users committed manifests containing secrets or because Argo CD plugins generated manifests with secrets. (Cited from: "We also observed an interesting Secret Management page on the documentation website, stating the following: ... There are two possibilities where secrets can end up in the cache: Users commit manifests with secrets or Argo CD Admins configure plugins to generate manifests with secrets.")
  • Attacker lists keys in Redis and identifies keys likely containing application manifests or related cached data. (Cited from: "We found many keys that seem to reflect data about our deployed application.")
  • BR-87: Process Socket Deny - This mechanism is applicable because listing keys requires network communication with Redis. If the process used (e.g., redis-cli) is denied socket access, this step fails.
  • Attacker retrieves the content of these keys, noting it is compressed (gzip). (Cited from: "When we tried to view their content, it appeared compressed, likely in gzip format.")
  • BR-87: Process Socket Deny - This mechanism is applicable because retrieving key content requires network communication. If the process used is denied socket access, this step fails.
  • Attacker uses a tool (like the researchers' Go program) to decompress the gzip data. (Cited from: "we developed a Go program to decompress the gzipped contents and uncover their values.")
  • Attacker parses the decompressed data (cached manifests, etc.) and extracts any sensitive information found, such as Kubernetes Secrets, API keys, or other credentials. (Cited from: "This confirms that any pod in the cluster could potentially access the Redis server and secrets.", "access sensitive information, including Kubernetes Secrets.")
  • BR-41: Container Memory Namespace Isolation / BR-68: Spaces-based Strong Isolation - These mechanisms provide memory isolation between namespaces/processes. While they don't directly prevent accessing data over the network from Redis, they would prevent an attacker in one namespace from directly reading the memory of the Argo CD controller process or the Redis process in another namespace, limiting alternative paths to the same data.
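
From a defender's perspective, the exposure described in F2 can be audited by checking whether rendered manifests contain Secret objects before they are ever cached. The following sketch decompresses a gzip blob and lists any objects of kind Secret. It assumes the blob is a gzipped JSON array of Kubernetes objects, which is an illustrative layout and not the actual structure of Argo CD's cache entries; the sample data and function name are hypothetical.

```python
import gzip
import json


def find_secret_objects(compressed: bytes):
    """Return (namespace, name) for every object of kind Secret in a gzipped JSON array."""
    manifests = json.loads(gzip.decompress(compressed).decode("utf-8"))
    found = []
    for obj in manifests:
        if obj.get("kind") == "Secret":
            metadata = obj.get("metadata") or {}
            found.append((metadata.get("namespace", "default"),
                          metadata.get("name", "<unnamed>")))
    return found


# Hypothetical sample: one Deployment and one Secret rendered into the same manifest set.
sample = gzip.compress(json.dumps([
    {"kind": "Deployment", "metadata": {"name": "web", "namespace": "prod"}},
    {"kind": "Secret", "metadata": {"name": "db-credentials", "namespace": "prod"},
     "data": {"password": "cGxhY2Vob2xkZXI="}},
]).encode("utf-8"))

if __name__ == "__main__":
    for namespace, name in find_secret_objects(sample):
        print(f"Secret found in rendered manifests: {namespace}/{name}")
```

Any hit from a check like this indicates a manifest source that should move to controller-based secrets management, so that secret values never pass through the cache.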
